High Precision Peg-in-Hole Assembly Approach 
Based on Sensitive Robotics and Deep Recurrent 
Q-Learning 


Nehal Atef Afifi, Marco Schneider, Ali Kanso and Rainer Muller 


Abstract 


Sensitive robot systems are used in various assembly and manufacturing technologies. 
Assembly is a vital activity that requires high-precision robotic manipulation. One of the 
challenges faced in high precision assembly tasks is when the task precision exceeds the 
robot’s precision. In this research, Deep Q-Learning (DQN) is used to perform a very 
tight clearance Peg-in-Hole assembly task. Moreover, recurrence is introduced into the 
system via a Long Short-Term Memory (LSTM) layer to tackle DQN drawbacks. The 
LSTM layer has the ability to encode prior decisions, allowing the agent to make more 
informed decisions. The robot’s sensors are used to represent the state. Despite the tight 
hole clearance, this method successfully achieved the task at hand, which was 
validated on a 7-DOF KUKA LBR iiwa sensitive robot. This paper focuses on the 
search phase. Furthermore, our approach has the advantage of working in environments 
that vary from the learned environment. 
1 Introduction 


Industrial robotics plays a key role in production, notably in assembly. Despite the fact that 
industrial robots are currently primarily used for repetitive, dangerous, or relatively heavy 
operations, robotic applications are increasingly being challenged to do more than simple 
pick-and-place activities [1, 2]. They must be able to react to their surroundings. To this end, 
sensitive robot systems are capable of conducting force- or torque-controlled applications, 
which are used to achieve the required interaction with the environment. Although 
there is no clear definition of the term sensitivity, based on the measurement technology 
DIN 1319 norm, sensitivity is defined as the change in the value of the output variable of a 
measuring instrument in relation to the causal change in the value of the input variable [3]. 
Special control strategies are required in the case of physical contact with the environment, 
since pure position control, as utilized in part manipulation, is no longer sufficient. 
Furthermore, relying solely on force control is also insufficient, so it makes sense to employ 
hybrid force/position control [4, 5]. Depending on the task, it is therefore necessary to decide 
which of the translational and rotational degrees of freedom are position controlled and which are force 
controlled [6]. The Peg-in-Hole assembly is an example of a robotic task that requires direct 
physical contact with the surrounding environment [7]. It has been extensively researched in 
both 2-D [8, 9] and 3-D environments [10, 11], and a variety of techniques for solving it have 
been presented [8-15]. Conventional online programming methods, in which a teach pendant 
is used to guide the robot to the desired positions while each movement is recorded, have been 
widely used to train robots to perform precise industrial processes as well as 
assembly activities. This strategy is time consuming and challenging to adapt 
to new environments. Another approach is offline programming (simulation) [9, 12]. 
While it has many advantages in terms of downtime, it is difficult to simulate the actual 
environment precisely due to environmental variance, and it is inefficient in industrial activities when 
the required precision exceeds robot accuracy. Due to the limitations of these techniques, 
a new skill acquisition technique has been proposed [11, 15], where the robot learns to do 
the high precision mating task using reinforcement learning [11]. 


2 State of the Art 


A variety of techniques in tackling Peg-in-Hole assembly challenges have been suggested 
[8-15]. This section will go over some of these strategies. Gullapalli et al. [8] investigated a 
2-D Peg-in-Hole insertion task, focusing on employing associative reinforcement learning to 
learn reactive control strategies in the presence of uncertainty and noise, with a 0.8 mm gap 
between peg and hole. A Zebra Zero robot with a wrist force sensor and position encoders 
was used. Their evaluation was conducted over 500 sequential training runs. Hovland et 
al. [15] proposed skill learning by human demonstration where they implemented a hidden 
Markov model. Nuttin et al. [12] ran a simulation with a CAD-based contact force simulator. 
Their results show that the insertion is effective if the force level or time surpasses a particular 
threshold. Their approach focuses solely on the insertion and uses reactive control with 
reinforcement as its strategy, in which the learning process is divided into two phases. The 
first phase is the controller, which consists of two networks: a policy network and an exploration 
network. The second phase is an actor-critic algorithm, in which the actor calculates the action 
policy and the critic is responsible for computing the Q-value. Yun [9] imitated the human 
arm using passive compliance and learning. He used a simulation, implemented in MATLAB, 
to solve a 2-D Peg-in-Hole task with a 3-DOF manipulator, focusing on the search phase 
only. The accuracy is 0.5 mm, and the training was done on a gap of 10 mm. The main goal 
of his research is to demonstrate the significance of passive compliance in association with 
reinforcement learning. We use integrated torque sensors with the deep learning algorithm, 
unlike Abdullah et al. [14], who used a vision system with force/torque sensors to achieve 
automatic assembly by imitating human operating steps; vision systems have 
limitations due to changes in illumination that may cause measurement errors. Also, unlike 
Inoue et al.’s [11] strategy, in which the robot’s movement is caused by a force condition 
in the x and y directions, in our approach the robot’s motion is a discrete displacement action 
in the x or y direction. A motion resulting from a force condition raises the difficulty 
that such a force condition may not be reached due to the physical interaction between the 
robot and the environment (e.g. the stick-slip effect), eventually resulting in a theoretically 
infinite motion. Furthermore, in the aforementioned approaches the Peg-in-Hole 
task was not conducted on a very narrow hole clearance, and some of these approaches 
were only confirmed in simulation, which is not as exact as the real world, adding to 
the challenge of adjusting to real-world variance. Moreover, our approach has the 
advantage of adapting to variations in both hole location and environmental settings, as 
well as the ability to take actions based on a prior state trajectory rather than just the current 
state. It also has the capability of compensating for sensor delays. 


3 Problem Formulation and Task Description 


As previously stated, when the required level of precision of the assembly task surpasses 
the robot precision, it is difficult to perform Peg-in-Hole assembly tasks, and it is even more 
challenging to perform them using force controlled robotic manipulation. Our approach to 
solving the Peg-in-Hole task employs a recurrent neural network trained with reinforcement 
learning using skill acquisition techniques [11, 13]. The first learned skill is the 
search phase, in which the robot seeks to align the peg center within the clearance 
zone of the hole center. A successful search phase is followed by the insertion phase, in which 
the robot is responsible for correcting the orientational misalignment. This paper focuses 
solely on the search phase. This research is done on a clearance of 30 µm using a robot with 
a repeatability of 0.14 mm and a positional inaccuracy of some millimeters. 
4 Reinforcement Learning 


Reinforcement learning (RL) is an agent-in-the-loop learning approach in which an agent 
learns by performing actions on an environment and receiving a reward ($r_t$) and an updated 
state ($s_t$) of the environment as a result of those actions. The aim is to learn an optimal action 
policy for the agent that maximizes the eventual cumulative reward ($R_t$) shown in Eq. (1), 
where ($\gamma$) indicates the discount factor, ($r_t$) is the current reward generated from performing 
action ($a_t$), and ($t$) denotes the step number. The learned action policy is the probability 
of selecting an action from a set of possible actions in the current state [11, 16]. 


$$R_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots + \gamma^{T-t} r_T = r_t + \gamma R_{t+1} \tag{1}$$
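
The recursive form on the right-hand side of Eq. (1) can be evaluated backwards over a finished episode. The following is a minimal sketch; the function name `discounted_returns` and the example values are illustrative assumptions, not part of the original work.

```python
def discounted_returns(rewards, gamma):
    """Compute R_t for every step t via R_t = r_t + gamma * R_{t+1}."""
    returns = [0.0] * len(rewards)
    running = 0.0  # the return after the final step is taken as 0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Illustrative episode with three rewards and gamma = 0.5
example_returns = discounted_returns([1.0, 0.0, 2.0], 0.5)
```

Iterating from the last step to the first keeps the computation linear in the episode length.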


Deep Q-Learning Q-Learning is a model-free off-policy RL technique. Model-free techniques 
do not require an environment model. Q-Learning learns the optimal action 
policy implicitly by learning the optimal Q-value function; it is off-policy because the greedy 
policy being learned differs from the behavior policy used for exploration. The Q-value function at a given 
state-action (s,a) pair is a measure of the desirability of taking action (a) in state (s) as illustrated 
in Eq. (2). 


$$Q^{\pi}(s,a) = \mathbb{E}\left[\sum_{k=0}^{T} \gamma^{k} r_{t+k} \;\middle|\; s_t = s,\ a_t = a\right] \tag{2}$$
Q-Learning employs the ε-greedy policy as behavior policy, in which an agent chooses a 
random action with probability (ε) and chooses the action that maximizes the Q-value for 
the (s,a) pair with probability (1-ε) (see Eq. (3)). In this paper, exploration and exploitation 
are not set to fixed percentages. Instead, the exploration rate decays 
linearly per episode as shown in Eq. (4). 


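$$a = \begin{cases} a \sim \mathrm{random}(A_t), & \text{with } P = \varepsilon \\ \arg\max_a Q(s,a), & \text{with } P = 1 - \varepsilon \end{cases} \tag{3}$$

$$\varepsilon_{n+1} = \varepsilon_{\mathrm{initial}} - \varepsilon_{\mathrm{decay}} \times n \tag{4}$$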
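
Eqs. (3) and (4) can be sketched in a few lines. This is an illustrative sketch only: the default values, the lower bound `eps_min`, and the function names are assumptions that do not appear in the original text.

```python
import random


def epsilon_for_episode(n, eps_initial=1.0, eps_decay=0.01, eps_min=0.0):
    """Linear decay as in Eq. (4); clipping at eps_min is an added assumption."""
    return max(eps_min, eps_initial - eps_decay * n)


def select_action(q_values, epsilon, rng=random):
    """Eq. (3): random action with probability epsilon, greedy action otherwise."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon = 0` the selection is purely greedy, and with `epsilon = 1` it is purely random, which matches the two branches of Eq. (3).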


The simplest form of Q-Learning is a tabular form which uses an iterative Bellman based 
update rule as seen in Eq. (5). Tabular Q-Learning computes the Q-value function for every 
(s,a) pair in the problem space, which makes it unsuitable for the assembly task at hand 
due to the complexity and variety of the environment. To overcome the tabular formulation 
drawbacks, DQN was introduced in [16], in which a neural network is employed as a function 
approximator of the Q-value of an (s,a) pair. 


$$Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right] \tag{5}$$
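
For illustration, the tabular update of Eq. (5) might look as follows. The dictionary-based table, the state labels, and the default values of `alpha` and `gamma` are assumptions made for this sketch.

```python
from collections import defaultdict


def q_update(q_table, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-Learning step (Eq. (5)) and return the new Q(s, a)."""
    best_next = max(q_table[(s_next, a2)] for a2 in actions)
    td_error = r + gamma * best_next - q_table[(s, a)]
    q_table[(s, a)] += alpha * td_error
    return q_table[(s, a)]


# Unknown (state, action) pairs default to a Q-value of 0.0
q_table = defaultdict(float)
```

Because the table needs one entry per (s,a) pair, its size grows with the product of the number of states and actions, which is the scaling problem noted above.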


Deep Recurrent Q-Learning While Deep Q-Learning can learn action policies for problems 
with large state spaces, it struggles to learn sequential problems where the action choice 
is based on a truncated trajectory of prior states and actions. This challenge motivated the use 
of another DQN variant which has a memory to encode previous trajectories. In this paper, 
a Deep Recurrent Q-Network (DRQN) is utilized as a suitable DQN variant. DRQN was 
introduced in [17] to solve the RL problem in partially observable Markov decision processes 
(POMDPs). DRQN utilizes long short-term memory (LSTM) layers to add recurrence to 
the network architecture. The LSTM layer can encode previous (s,a) trajectories, providing 
enhanced information for learning the Q-values. In addition, the recurrence can account for 
sensor and communication delays. 


Action and Learning Loops The Deep Recurrent Q-Learning algorithm is illustrated in 
Fig. 1. The algorithm can be divided into two parallel loops: the action loop (green) and the 
learning loop (yellow). The action loop is responsible for choosing the agent’s action, where the 
current environment state is fed through a policy network. The policy network estimates the 
Q-value function over the current state and the set of available actions. Based on the ε-greedy 
exploration rate, the agent’s action is either the action with the highest Q-value or a randomly 
sampled action as illustrated in Eq. (3). At each step, the experience ($s_t$, $r_t$, $a_t$, $s_{t+1}$) is saved 
in a replay memory. After a predefined number of episodes, the agent starts learning from 
randomly sampled experience batches. Each experience batch is a sequence of steps with 
a defined length from a randomly sampled episode. The target network is an additional 
network serving as a temporarily fixed target for optimization of the Bellman Eq. (5). The 
weights of the target network are copied from the policy network after a number of steps. 
The policy network estimates the Q-value of the ($s_t$, $a_t$) pair while the target network estimates 
the maximum Q-value achievable in ($s_{t+1}$). The output from both networks is used to compute 
the proposed loss function in Eq. (6). The policy network is then trained by gradient descent, 
with the loss back-propagated through the network as illustrated in Eq. (7). 
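
The sequence-based sampling described above (a fixed-length run of steps from a randomly chosen episode) could be sketched as follows. The class name, the capacity handling, and the tuple layout are assumptions for illustration; the original implementation is not shown in the text.

```python
import random


class EpisodeReplayMemory:
    """Stores whole episodes and samples contiguous step sequences from them."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.episodes = []

    def add_episode(self, transitions):
        # Each transition is assumed to be a tuple (s_t, r_t, a_t, s_next)
        self.episodes.append(list(transitions))
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)  # drop the oldest episode

    def sample_sequence(self, seq_len, rng=random):
        candidates = [ep for ep in self.episodes if len(ep) >= seq_len]
        if not candidates:
            raise ValueError("no stored episode is long enough")
        episode = rng.choice(candidates)
        start = rng.randrange(len(episode) - seq_len + 1)
        return episode[start:start + seq_len]
```

Sampling contiguous steps, rather than isolated transitions, preserves the temporal order that a recurrent layer needs in order to build up its hidden state.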


Fig. 1 Action and Learning Loops (action loop: environment, ε-greedy action selection through the policy network; learning loop: experience replay, random sampling, target network with parameters copied every k iterations, loss function, gradient descent)
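$$L_\theta = \frac{1}{2}\left[\text{target} - \text{prediction}\right]^2 = \frac{1}{2}\left[ r + \gamma \max_{a'} Q_\theta(s',a') - Q_\theta(s,a) \right]^2 \tag{6}$$

$$\theta \leftarrow \theta + \alpha \left( r + \gamma \max_{a'} Q_\theta(s',a') - Q_\theta(s,a) \right) \nabla_\theta Q_\theta(s,a) \tag{7}$$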
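
As a brief check of how Eq. (7) follows from Eq. (6), assume that the target term $y$ is held constant with respect to $\theta$. This is the usual semi-gradient treatment when a separate target network is used; the assumption is stated here for the derivation and is not spelled out in the text above.

```latex
\begin{aligned}
y &= r + \gamma \max_{a'} Q_\theta(s', a') \quad \text{(treated as constant)} \\
L_\theta &= \tfrac{1}{2}\,\bigl[y - Q_\theta(s,a)\bigr]^2 \\
\nabla_\theta L_\theta &= -\bigl[y - Q_\theta(s,a)\bigr]\,\nabla_\theta Q_\theta(s,a) \\
\theta &\leftarrow \theta - \alpha \nabla_\theta L_\theta
        = \theta + \alpha \bigl[y - Q_\theta(s,a)\bigr]\,\nabla_\theta Q_\theta(s,a)
\end{aligned}
```

The last line is Eq. (7), so the update is ordinary gradient descent on the loss of Eq. (6) with learning rate $\alpha$.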


5 Search Skill Learning Approach 


This paper focuses on the search skill, which is discussed in more detail in the following 
subsections. Fig. 2 illustrates how the learning process is done. 


Initial Position Each episode starts with the peg in a random position. The polar coordinates 
of the initial position are determined by a predefined radius from the hole’s center and a 
randomly sampled angle between 0 and 2π (see Fig. 3). The advantage of utilizing such an 
initialization method is that it maintains the initial distance to the hole center while covering 
the full task space. 
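
The initialization strategy can be sketched as follows. The function and parameter names are illustrative assumptions; only the polar sampling itself is taken from the description above.

```python
import math
import random


def sample_initial_position(x_hole, y_hole, radius, rng=random):
    """Sample a start point on a circle of the given radius around the hole center."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    x = x_hole + radius * math.cos(theta)
    y = y_hole + radius * math.sin(theta)
    return x, y
```

Every sampled point lies at the same distance from the hole center, so episodes differ only in the direction from which the search starts.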


State At each time step, the reinforcement learning (RL) agent receives a new state sensed 
by the robot (see Fig. 2 lower arrows), which consists of forces in x, y, and z ($F_x$, $F_y$, $F_z$), 
moments around x and y ($M_x$, $M_y$), and rounded positions in x and y ($P_x$, $P_y$) as seen in 
Eq. (8). In order to provide enough robustness against positional inaccuracy, it was assumed 
that the hole and the peg were not precisely positioned. $P_x$, $P_y$ are computed using the grid 


Fig. 2 Illustration of How Robot Learns New Skill Using Deep RL (blocks: Reinforcement Learning, Robot Controller, Assembly Task)


$$\theta \sim [0, 2\pi], \qquad x = x_{hole} + r \cdot \cos(\theta), \qquad y = y_{hole} + r \cdot \sin(\theta)$$


Fig. 3 Peg Initial Position Strategy 
indicated in Fig. 4, where C is the margin of the positional error. This approach provides auxiliary 
inputs to the network, which can aid in accelerating learning convergence. 


$$S = [F_x, F_y, F_z, M_x, M_y, P_x, P_y] \tag{8}$$
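
One possible reading of the grid-based rounding of $P_x$ and $P_y$ is to snap each coordinate to the nearest multiple of the margin C. The exact rule is given only in Fig. 4, so the sketch below is an assumption, as are the function names.

```python
import math


def grid_index(position, cell_size):
    """Index of the grid cell nearest to position (round half up; assumed rule)."""
    return math.floor(position / cell_size + 0.5)


def rounded_position(position, cell_size):
    """Position snapped to the center of its grid cell."""
    return grid_index(position, cell_size) * cell_size
```

Under this reading, all positions within half a cell of a grid point map to the same network input, which hides positional errors smaller than the margin.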


Action A deep neural network (policy network) is utilized in the current system to estimate 
the Q-value for each (s,a) pair, from which an action index is generated and sent to 
the robot controller. The action index is then used by the robot to select one of 
four discrete actions (see Fig. 2 upper arrows), each of which has a constant force in the 
z-direction ($F_z^d$). According to Eq. (9), the agent alters its desired position in either x 
or y ($\pm d_x^d$, $\pm d_y^d$). For all four discrete actions, the orientation of the peg ($R_x^d$, $R_y^d$) is set 
to zero throughout the search phase. The advantage of maintaining constant and continuous 
force in the z-direction is that when the search algorithm finds the hole, the peg height drops 
by a fraction of a millimeter, which is a success criterion for the search phase. 


$$a = [d_x^d, d_y^d, F_z^d, R_x^d, R_y^d] \tag{9}$$
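
The mapping from an action index to the action vector of Eq. (9) might be sketched as below. The assignment of indices to directions, the step size, and the force value are hypothetical, since the text does not state them.

```python
# Hypothetical index-to-direction mapping: (sign in x, sign in y)
ACTIONS = {
    0: (+1, 0),  # step in +x
    1: (-1, 0),  # step in -x
    2: (0, +1),  # step in +y
    3: (0, -1),  # step in -y
}


def action_vector(index, step, force_z):
    """Build [d_x, d_y, F_z, R_x, R_y] with zero orientation, as in Eq. (9)."""
    sign_x, sign_y = ACTIONS[index]
    return [sign_x * step, sign_y * step, force_z, 0.0, 0.0]
```

Each action moves along exactly one axis while the z-force and the zero orientation stay the same, mirroring the four discrete actions described above.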


Reward A reward function is used to evaluate how much an agent is rewarded or punished 
for performing an action in the current state. In the proposed methodology, the reward ($r_t$) is 
calculated after the completion of each episode. The reward zones for our search task are 
illustrated in Fig. 5. First, the inner circle (Green Zone) indicates that the peg has either 
reached the goal position or that the maximum number of steps ($k_{max}$) per episode is reached 
with the peg close to the goal. Inside the second circle (White Zone), the peg is at a distance 
less than the initial distance ($d_0$) and receives a reward of zero. Moreover, when the robot 
moves away from the starting position toward the boundaries of the safety limits (Yellow 
Zone), the agent receives a negative reward. Finally, the working space barrier is the outer 
square (Red Zone), which indicates that the peg is violating the safety restrictions (D) and 
receives the highest negative reward. 


Fig.4 Examples of Peg Position Rounding Approach Using Grid Size 
$$\text{On failure: } r = \begin{cases} 0, & d < d_0 \\ -\dots, & d_0 < d < D \end{cases}$$


Fig.5 Different Reward Zones 


6 Implementation and Validation 


A KUKA LBR iiwa, which is a sensitive robot arm with an open kinematic chain and 
integrated sensors, is used for this work. The integrated torque sensors are based on strain 
gauges in each of the robot’s joints that enable the determination of external forces and 
torques acting on the robot. Force-controlled robot applications are therefore possible when 
combined with the control approach discussed. The peg and block used in this study are 
made of corrosion-resistant stainless steel, which is well suited for this purpose due to the 
continuous force exerted during the experiments. The clearance between the peg and the hole is 
30 µm. The experimental setup is displayed in Fig. 6. As mentioned before, such assembly 
is done with the assistance of artificial intelligence as the task accuracy exceeds the robot 
precision. According to KUKA, the position repeatability of the LBR iiwa is ±0.15 mm 
[18]. This was verified according to ISO 9283:1998 with the help of a high-precision laser tracker 
(API R50-Radian) with an accuracy of ±10 µm + 5 µm/m. The measured repeatability was 
0.14 mm, which equates to around five times the clearance between the peg and the hole. 
In order to assure data flow between the DRQN and the robot, Message Queuing Telemetry 
Transport (MQTT) was used. MQTT is a bidirectional network protocol based on the client-server 
principle rather than the end-to-end connection paradigm used by many other network 
protocols. Messages are not sent directly to clients; rather, communication is event-based and 
follows the publish-subscribe paradigm [19]. 


Fig. 6 Experimental Setup: in simulation (left) and in reality (right) 


In order to validate our approach, we conducted 
experiments by running 200 learning episodes followed by some test runs. In order to achieve 
near optimal hyperparameter values, a few tests were conducted by maintaining all variables 
constant and adjusting one at a time. Throughout the training, the agent was able to find 
the hole in 130 of the 200 episodes. Two test trials were conducted, in which 
the agent was able to locate the hole 18 times out of 21 and 27 times out of 31, for an 
overall success rate of 86.5%. As shown in Fig. 7c, the loss decreases during the training 
process, and the loss curve also indicates a well-chosen learning rate. Moving 
on to Fig. 7a, the graph shows that the peg strives to stay near the hole position and only 
drifts further away a few times. Experiments revealed that a sparse reward function (Fig. 7b) 
is not the best fit for the search challenge, and that denser reward functions should be 
investigated. Fig. 7d shows a cutout from the trajectory using two cases where the hole was 
identified (Success) and one case where the defined limits were exceeded (Failure). 
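
For reference, the reported rates follow directly from the stated counts. The overall test figure is the pooled rate over both trials, not the mean of the two per-trial rates:

```latex
\frac{18}{21} \approx 0.857, \qquad
\frac{27}{31} \approx 0.871, \qquad
\frac{18 + 27}{21 + 31} = \frac{45}{52} \approx 0.865, \qquad
\frac{130}{200} = 0.65 \ \text{(training)}
```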
Fig. 7 Experimental results 


7 Conclusion and Future Work 


This research demonstrated and validated the success of our proposed strategy using DRQN 
in addressing a high-precision Peg-in-Hole assembly task using a 7-DOF sensitive robot 
with integrated sensors. The employed approach was successful in completing the search 
phase. It was also shown that integrating recurrence into a reinforcement learning system via 
an LSTM layer overcomes DQN’s drawbacks, where the LSTM layer was able to encode 
previously taken decisions, allowing the agent to make a better informed decision and 
overcome sensor delays. In the future, the approach will be extended to the insertion phase, 
and the network architecture will be improved, including tuning the hyperparameters, in order 
to reach an overall success rate of 100%. In addition, we are planning to evaluate continuous 
action space techniques such as DDPG, DPPO, or NEAT, which could potentially enhance 
the performance. 
Generalized Model for the Optimization of a LIFO 
Topology Storage Using a Metaheuristic Algorithm 


Dominik Kuhn, Jan Adelsbach, Martin Karkowski and Rainer Muller 


Abstract 


LIFO topologies, due to their simplicity and high degree of space usage efficiency, are 
common in applications for which a flexible and cost-effective storage solution is required. 
This topology, however, represents an optimization challenge due to insertion and removal 
constraints. A scalable generalized model for the optimization of this topology using 
a population-based metaheuristic algorithm is presented in this paper. The model that 
represents this storage topology in a way suitable for population-based metaheuristics and 
its implementation are discussed. It is validated using practical usage 
scenarios from logistics and assembly such as non-stacking condensed pallet storage. 


Keywords 


LIFO · Metaheuristic · Genetic algorithm 


1 Introduction 


Last-In-First-Out (LIFO) type storage is used in logistics and assembly systems, where a 
high degree of storage efficiency is required. Application scenarios include part shelves in 
assembly, automated guided vehicles in matrix production [1] and in general warehousing. 

A particular optimization problem of this storage type is the reduction of item shuffling, 
that is, the reduction of item relocation in order to clear access to an item that is to be 
retrieved. In an optimal case the item would always be at the front of the respective strip. 
However, this can only be guaranteed if the strip is homogeneous in terms of item types. As 
described in [2, 3], in certain industries such as food, medical and chemicals this is not 
always possible due to expiration dates and production batch requirements. Furthermore, a 
type-pure homogeneous storage would assume enough strips to handle all item types 
as well as any potential backlog thereof, which is unrealistic in many application scenarios. 

Usually, an attempt is made to describe this problem mathematically and solve it by 
means of Linear Programming (LP), as described in [4]. Many works 
are grouped under the term Storage Location Assignment Problem (SLAP), many of which 
do not explicitly consider the LIFO topology. In a previous paper [2], we presented an 
approach that deals precisely with the problem of optimizing this type of LIFO storage using 
metaheuristics. In this paper, we discuss the structure and further development 
of the data structure, as well as the effects of the different approaches. 

The original motivation of this work was the optimization of a LIFO type pallet warehouse 
of a beverage producer. However, given the aforementioned further application scenarios 
and the little research work on this subject, it was decided to further generalize the approach, 
as we see potential in its use in adaptable assembly systems. 

We first define the model used for the algorithm, the prior work done and give a brief 
overview of the theoretical basis. Subsequently we describe the evolution of the data structure 
used and the convergence behavior as well as the concepts of exploitation versus exploration 
with the said data structure. 


2 Problem Description 


The general model concerned with this optimization problem is that of a LIFO type storage 
consisting of an arbitrary number of strips. Each strip acts like an individual LIFO row, in 
that items can only be removed in the reverse sequence of them being stored. This mirrors 
a typical generalized application scenario of LIFO storage, such as pallet warehousing. 

Following a similar definition as [3], let W be a storage composed of $n_s$ strips, W = 
{1, ..., $n_s$}, each of which may contain $n_i$ items, $W_i$ = ($s_1$, ...). Given a sequence of $n_j$ 
items to be placed into the storage, Z = ($z_1$, ...), the goal is to find a configuration for the 
placement of the items in the storage such that, under a scoring method $f(W) \to \mathbb{R}$, the 
storage sits at Pareto optimality. 
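
To make the LIFO constraint concrete, a storage of strips can be modeled as a list of stacks. The class below is a hypothetical sketch for illustration (its names and the relocation count are assumptions, not the authors' model); it shows why retrieving an item that is not at the front forces shuffling.

```python
class LifoStorage:
    """Storage W made of strips; each strip is a stack with a fixed capacity."""

    def __init__(self, capacities):
        self.capacities = list(capacities)
        self.strips = [[] for _ in self.capacities]

    def push(self, strip, item):
        if len(self.strips[strip]) >= self.capacities[strip]:
            raise ValueError("strip is full")
        self.strips[strip].append(item)  # last element is the front of the strip

    def relocations_needed(self, strip, item):
        """Number of items blocking the nearest occurrence of item in the strip."""
        for blocked, candidate in enumerate(reversed(self.strips[strip])):
            if candidate == item:
                return blocked
        raise KeyError(item)
```

A scoring method f(W) could, for example, aggregate such relocation counts over an expected retrieval sequence; that choice is left open here, as it is in the text.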

The implementation of f(W) depends on the desired properties and can be implemented 
to examine single items, strips or the storage as a whole. Depending upon the formulation, this 
can represent a multi-objective optimization problem. For example, [2, 5] describe possible 
variants examining the homogeneity of strips in various manners and their combination into a 
single score. Further multi-dimensional approaches could additionally assess the access to items 
through empty neighbouring strips. 
3 Previous Works 


Current research approaches consider single item allocation, such as [3] with an ad-hoc 
placement strategy and a greedy relocation procedure, [5] with a genetic algorithm refined 
with simulated annealing or [4] with a linear model. However in those algorithms the place- 
ment of only single items is being considered, restricting the view of potential storage 
optimality that the consideration of multiple items could offer. 

It is therefore advantageous to optimize the storage allocation for multiple items at once. 
This allows the algorithm for example to take into account the current production schedule. In 
the previous work [2] we present an approach to solve the allocation problem for an ordered 
batch of items. The focus of the latter was on illustrating and explaining the problem in 
form of an implementation of a genetic algorithm. The implementation of typical storage 
restrictions in terms of the fitness function formulation was furthermore handled in detail 
and mathematically described. It should be noted that the way in which the fitness methods 
are implemented has a significant influence on the runtime and parameterisation of the 
algorithm. 

This approach has been developed in the Python programming language based on a genetic 
algorithm module. The latter module was extended by a number of adapted fitness, crossover 
and mutation methods. A meta class inherits from the original genetic algorithm module and 
overloads certain functions with extensions with regard to the LIFO data structures 
described below, as well as debugging and profiling functionality. Using an application programming 
interface (API), real-time data from a regional beverage producer who uses such a LIFO 
storage was used to test the approach in a real scenario. The storage has a 
capacity of up to 5000 pallets, with varying strip sizes of up to 30 pallets. The number 
of pallets whose storage location is optimized is taken from a production queue and 
varies between 8 and 12 pallets. 


4 Theoretical Basis 
4.1 Biological Model and Basic Idea 


The idea of genetic algorithms is inspired by evolutionary processes in nature, through which 
individuals adapt increasingly well to environmental conditions. The principle, which is 
described e.g. in [6, 7], goes back to Charles R. Darwin, who proclaimed it as “survival of the 
fittest”. The basic idea of transferring this approach to mathematical optimisation problems 
as a metaheuristic method goes back to the work in [8]. 

As an evolutionary optimisation method, the solution is represented by an individual in 
genetic algorithms. Multiple individuals together build a population and thereby a set of 
solutions in a search space. 
The individuals of each generation must be characterised with regards to the problem 
under consideration for solution quality. This task is usually performed by a fitness function 
that quantifies the quality of an individual with the use of a real value. 

Corresponding to the biological mechanisms of crossing, mutation and selection, there 
are also methods to enable the population of individuals to evolve. 


4.2 Structure and Function of the Genetic Algorithm 


For a better understanding we first briefly review the workings of a genetic algorithm in 
this section. The algorithm consists mainly of the following components and phases. These 
phases are executed in an iterative manner: 


1. representation (definition of individuals), 
2. initialization, 
3. evaluation function (fitness function), 
4. variation operators like recombination and mutation, 
5. selection mechanism, 
6. repeat from step 3 until a stop criterion is reached. 


The first step of an evolutionary algorithm such as a genetic algorithm is to define a description 
or representation of the context and the search space of the problem. This often involves 
simplifying or abstracting a real-world problem to derive a clearly defined context. It must be 
decided how a possible solution should be coded and stored so that it can be processed by a 
computer in the given programming language. Simple problems are often implemented in the 
form of binary permutations, so-called genotype representations. More complex problems 
must be implemented using more complex data types, as simple binary encoding is no longer 
sufficient. In this case, several structures are often used to map the relationships. 

After coding the problem, the next step is to initialise a starting population randomly. 
The fitness of the individuals is then calculated using a suitable 
fitness function. 

A selection method is used to pick out a given number of the fittest individuals from the 
current population in order for them to be transferred into the next population. This usually 
takes into account a stochastic component when selecting the fittest individuals [6, 9]. 

During recombination, new individuals are generated from two selected individuals with 
a crossover operator. From the newly emerged individuals, candidates for mutations are 
selected with a low probability to strengthen the exploration of the search space. Mutation 
involves changing a random part of the individual. Various mutation methods are evaluated 
in [10]. Depending on the coding chosen, a mutation method has a different impact on 
exploration. Depending upon the problem, certain mutation operations are not effective and 
produce useless solutions in the search space [9]. 
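
The phases described in this section can be sketched as a compact loop over bit-list individuals. This is a generic illustration, not the authors' module: truncation selection, one-point crossover, bit-flip mutation, and all parameter values are assumptions chosen for brevity.

```python
import random


def run_ga(fitness, genome_len, pop_size=20, generations=50,
           mutation_rate=0.05, rng=None):
    """Minimal genetic algorithm maximizing fitness over lists of bits."""
    rng = rng or random.Random(0)
    # Representation and random initialization
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation and selection: keep the fitter half unchanged
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            # Recombination: one-point crossover of two distinct parents
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a low probability
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

For example, `run_ga(sum, genome_len=12)` searches for the bit list with the most ones. Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.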
Fig. 1 Sparse and dense data structures for the same item configuration (panels: Sparse Vector Approach 
and Dense Vector Approach, each showing strips 1 to 4 with a map vector and an individual vector). Shaded boxes in the storage 
are occupied, empty boxes are free. Numbered vector elements are strip indices 


5 Data Structure 


The data structure in the previous work [2] was initially chosen in such a way that standard 
crossover and mutation operators for the traveling salesman problem, such as described in 
[6], can be applied, while the interpretation of the data structure keeps track of the 
LIFO principle and the item sequence. 

This was originally implemented as two vectors. The first is a map vector in which every element 
corresponds to an empty slot in the LIFO storage in terms of the strip number. That 
is, if a strip i has n free slots, the vector contains n elements with value i. The second 
vector, the individual, is of equal size and corresponds by index to the map vector; each of its 
elements is occupied either by an empty indicator or by an identifier for the item to be placed (Fig. 1). 

The individual vector is then interpreted sequentially from left to right by skipping 
over empty elements and pushing items into the farthest free position from the front of 
the corresponding strip. This is illustrated in Algorithm 1 for an individual vector I and a map 
vector M, both of size n, as well as a function pushToStrip(strip, item) that pushes the item 
onto the back of a strip. 


Algorithm 1 Sparse vector to storage mapping 
Require: I, M, pushToStrip(strip, item) 
for i := 1 to n do 
    if I_i ≠ ∅ then 
        PUSHTOSTRIP(M_i, I_i) 
    end if 
end for 

Standard mutation and crossover operators need a continuous vector filled with values. In 
order to facilitate the use of those operators, another vector is generated from the 
individual, in which all empty element indicators are replaced by values that are distinct 
from the item identifiers. 

After the application of the standard operator, the filled-in values are replaced again with 
empty element indicators, such that the resulting data structure matches the representation 
described above. 
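
As a minimal sketch, Algorithm 1 and the filling step for standard operators could look as follows in Python. The names EMPTY, map_sparse_to_storage, fill_empty and restore_empty, the tuple placeholders and the example vectors are assumptions made for illustration and are not taken from the original implementation. The storage is modelled as a dictionary of lists, where appending to a list stands in for pushToStrip.

```python
from collections import defaultdict

EMPTY = None  # hypothetical marker for an empty element of the individual


def map_sparse_to_storage(individual, map_vector):
    """Interpret a sparse individual from left to right (Algorithm 1)."""
    if len(individual) != len(map_vector):
        raise ValueError("individual and map vector must have the same size")
    storage = defaultdict(list)  # strip index -> items in push order
    for item, strip in zip(individual, map_vector):
        if item is not EMPTY:
            storage[strip].append(item)  # stands in for pushToStrip(strip, item)
    return dict(storage)


def fill_empty(individual):
    """Replace empty markers by distinct placeholders for standard operators.

    Assumes that no item identifier is a tuple starting with "empty".
    """
    return [("empty", i) if item is EMPTY else item
            for i, item in enumerate(individual)]


def restore_empty(filled):
    """Turn the placeholders back into empty markers after the operator ran."""
    return [EMPTY if isinstance(item, tuple) and item[:1] == ("empty",) else item
            for item in filled]


# Hypothetical example: strips 1 and 2 have two free slots each, strip 4 has one.
map_vector = [1, 1, 2, 2, 4]
individual = ["A", EMPTY, EMPTY, "B", "C"]
placement = map_sparse_to_storage(individual, map_vector)
```

A standard operator would be applied to fill_empty(individual), and restore_empty would be called on its result before the mapping is evaluated.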

A problem with this data structure is that, due to the high sparsity of the vector, probabilistic 
selection methods will often pick empty elements. Attempts were made to overcome this 
problem by having probabilistic operations choose only from occupied elements, using 
an indexed vector of the non-empty elements as described in the concepts of [11]. 

Furthermore, attempts were made to reduce the overall vector size, either by limiting the 
number of free slot elements per column to the maximum number of items to be placed, or by 
filtering out the worst scoring strips. 

In order to address these issues, the data structure was redesigned such that the individual 
vector is a standard dense vector. In this approach the map vector has one element per item 
to be stored and holds the item identifiers. The individual vector, in turn, contains the 
strip numbers. The number of free slots per strip is stored in a separate data structure, which is 
implemented using a standard dictionary. Care needs to be taken to ensure that each strip 
index occurs in the individual vector at most as often as there are free slots in that strip. This 
requires special handling in the initial population generation and in the mutation operators. 

This vector is still interpreted from left to right in order to maintain the storage order. 
Following the same terminology as for Algorithm 1, Algorithm 2 shows how this structure is mapped 
into a storage. 


Algorithm 2 Dense vector to storage mapping 
Require: I, M, pushToStrip(strip, item) 
for i := 1 to n do 
    PUSHTOSTRIP(I_i, M_i) 
end for 
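
A hedged Python sketch of Algorithm 2 is given below. The function name map_dense_to_storage, the dictionary free_slots and the example values are assumptions for illustration. The capacity check reflects the requirement that a strip index may occur at most as often as the strip has free slots.

```python
from collections import Counter, defaultdict


def map_dense_to_storage(individual, map_vector, free_slots):
    """Interpret a dense individual from left to right (Algorithm 2).

    individual[i] is the strip index chosen for the item map_vector[i].
    free_slots maps each strip index to its number of free slots.
    """
    if len(individual) != len(map_vector):
        raise ValueError("individual and map vector must have the same size")
    for strip, used in Counter(individual).items():
        if used > free_slots.get(strip, 0):
            raise ValueError(f"strip {strip} has too few free slots")
    storage = defaultdict(list)  # strip index -> items in push order
    for strip, item in zip(individual, map_vector):
        storage[strip].append(item)  # stands in for pushToStrip(strip, item)
    return dict(storage)


# Hypothetical example with three items and three strips that have free slots.
free_slots = {1: 2, 2: 2, 4: 1}
map_vector = ["A", "B", "C"]
individual = [1, 2, 4]
placement = map_dense_to_storage(individual, map_vector, free_slots)
```

Compared to the sparse form, the vectors here have one element per item rather than one element per free slot.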


In this approach the crossover and mutation functions need to be significantly adjusted to 
handle and modify the vector. In particular, a mutation function that does not merely swap 
elements needs to ensure, when altering a strip index, that the total number of 
occurrences of the new index in the individual is less than or equal to the number of free slots in 
that strip. This can be achieved by first counting the free slots for every strip in the storage 
and then subtracting the slots claimed by the individual. The mutation can 
then select a new strip index from the remaining ones. 
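
The counting scheme described above can be sketched as follows. This is only one possible reading: the function name mutate_strip, the single-gene mutation and the exclusion of the current strip from the candidates are assumptions made for illustration.

```python
import random
from collections import Counter


def mutate_strip(individual, free_slots, rng=random):
    """Reassign one randomly chosen gene to a strip that still has capacity."""
    # Count the free slots per strip, then subtract the slots that the
    # individual already claims.
    remaining = Counter(free_slots)
    remaining.subtract(Counter(individual))
    position = rng.randrange(len(individual))
    # Only strips with capacity left are candidates. The current strip is
    # excluded so that the mutation actually changes the individual.
    candidates = sorted(
        strip for strip, left in remaining.items()
        if left > 0 and strip != individual[position]
    )
    if not candidates:
        return list(individual)  # no free capacity elsewhere: leave unchanged
    mutated = list(individual)
    mutated[position] = rng.choice(candidates)
    return mutated


# Hypothetical example: strip 1 is already fully claimed by the individual.
free_slots = {1: 2, 2: 2, 4: 1}
individual = [1, 1, 2]
child = mutate_strip(individual, free_slots, random.Random(42))
```

Because the new strip index is drawn only from strips with remaining capacity, the mutated individual never claims more slots than a strip offers.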

In order to use standard crossover functions, distinct values are required. This can be 
achieved by operating on a shadow vector with increasing values representing the indices 
of the individual vector. After the crossover has been applied to the shadow vector, the result 
is used as a permutation of the individual. 
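
The shadow vector idea can be illustrated with the sketch below. The text does not specify which crossover operator is used or how the second parent's shadow vector is obtained, so the order crossover variant and the two index permutations passed in are assumptions made purely for illustration.

```python
import random


def order_crossover(parent_a, parent_b, rng=random):
    """Order crossover variant for two permutations of the same values."""
    n = len(parent_a)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    child = [None] * n
    child[lo:hi] = parent_a[lo:hi]  # keep a slice of parent A in place
    kept = set(parent_a[lo:hi])
    # Fill the remaining positions in the order in which parent B lists them.
    fill = [value for value in parent_b if value not in kept]
    free_positions = list(range(0, lo)) + list(range(hi, n))
    for position, value in zip(free_positions, fill):
        child[position] = value
    return child


def apply_permutation(individual, shadow):
    """Reorder the individual according to a shadow vector of indices."""
    return [individual[index] for index in shadow]


# Hypothetical example: the strip indices contain duplicates, the shadow
# vectors do not, so a permutation operator can be applied to them.
individual = [1, 1, 2, 4]
shadow_a = [0, 1, 2, 3]
shadow_b = [3, 2, 1, 0]
child_shadow = order_crossover(shadow_a, shadow_b, random.Random(1))
child = apply_permutation(individual, child_shadow)
```

Since the shadow vector only permutes the individual, the number of occurrences of each strip index is unchanged and the free slot constraint remains satisfied.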


6 Convergence Behavior 


The convergence behavior can be assessed both in terms of computational performance and 
maximum achieved optimality. In order to provide comparable results for the data structure 
approaches, a snapshot of a warehouse configuration from a beverage manufacturer was 
taken and used to validate the performances. 

To improve performance, the fitness of all strips in the storage as-is 
is computed upfront. During the execution of the algorithm the fitness is only recomputed 
for strips to which items have been assigned according to the configuration of 
the individual. This allows fast computation of the fitness score for the whole storage and 
results in a predictable performance pattern: if items are clustered, with an iteratively 
decreasing spread across the storage, the execution time of the fitness function over the 
population decreases. This behavior can be examined in Figs. 2 and 3. The pattern applies 
to both data structures, although the sparse structure has a significantly higher offset 
time and takes 10 to 20 times longer to compute than 
the dense structure. 

Fig. 2 Average fractional execution time of the different steps in the genetic algorithm using the 
dense vector approach from around 100 executions 

Fig. 3 Convergence behavior of the dense vector approach using exploitation only and both 
exploitation and exploration. The higher the convergence score, the better 
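
The upfront computation of the strip fitness can be sketched as below. The function storage_fitness, the additive combination of the strip scores and the toy strip fitness are assumptions for illustration, since the actual fitness function is not given in the text.

```python
def storage_fitness(baseline, placement, strip_fitness):
    """Fitness of the whole storage, recomputing only the strips that change.

    baseline:      strip index -> fitness of the strip as-is, computed once
    placement:     strip index -> items that the individual pushes onto it
    strip_fitness: hypothetical callable (strip, new_items) -> fitness
    """
    total = sum(baseline.values())
    for strip, items in placement.items():
        # Swap the cached score of a touched strip for a freshly computed one.
        total += strip_fitness(strip, items) - baseline[strip]
    return total


# Toy example: three strips with cached scores, one of which receives items.
baseline = {1: 0.5, 2: 0.2, 3: 0.1}


def toy_strip_fitness(strip, items):
    return min(1.0, baseline[strip] + 0.25 * len(items))


placement = {2: ["A", "B"]}
total = storage_fitness(baseline, placement, toy_strip_fitness)
```

The fewer strips an individual touches, the fewer calls to the strip fitness are needed, which corresponds to the decreasing execution time for clustered items.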


In terms of absolute performance, the choice of data structures, in particular of the dictionary 
or map type containers that are used for various lookup operations, can significantly affect 
the performance. In this case, however, both algorithms were implemented using standard 
Python container data structures, so the reported performance increase is to be understood 
in relative terms. 

In terms of optimality, the algorithm sometimes gets stuck in a local maximum. With 
an elitist genetic algorithm, this can result in multiple generations without improvement 
followed by erratic jumps. This behavior presents a challenge for an appropriate stop 
criterion. In the current algorithm the last n fitness values are kept 
and examined for a sufficient change δ. Once the change falls below a threshold, the algorithm is 
stopped. Currently, appropriate values were found to be a window of 
the last 20 generations and a required change of at least 0.01 for fitness scores in the range 
[-1, 1]. 
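
A minimal sketch of such a stop criterion is shown below. The class name StopCriterion and the use of the difference between the largest and smallest value in the window as the measure of change are assumptions, since the text does not state how the change is computed.

```python
from collections import deque


class StopCriterion:
    """Stop when the best fitness has barely changed over a fixed window."""

    def __init__(self, window=20, min_change=0.01):
        self.history = deque(maxlen=window)  # keeps only the last n values
        self.min_change = min_change

    def should_stop(self, best_fitness):
        self.history.append(best_fitness)
        if len(self.history) < self.history.maxlen:
            return False  # not enough generations observed yet
        change = max(self.history) - min(self.history)
        return change < self.min_change


# Hypothetical fitness trace: early improvement followed by a plateau.
criterion = StopCriterion(window=20, min_change=0.01)
fitness_trace = [0.1 * g for g in range(5)] + [0.4] * 30
stopped_at = None
for generation, fitness in enumerate(fitness_trace):
    if criterion.should_stop(fitness):
        stopped_at = generation
        break
```

With an elitist algorithm, a long plateau followed by a late jump would still end the run early, which is the weakness of this criterion noted in the text.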


7 Exploration vs. Exploitation 


In a genetic algorithm, the mutation methods have the task of introducing new entropy into 
an individual in order to expand the search space within a generation; this is called 
exploration. The crossover method, on the other hand, is in charge of combining the best 
aspects of two individuals in order to pass an improvement on to the next generation; this 
is exploitation. Without mutation methods and purely with crossover methods, there is 
a risk that the algorithm will remain in a local extremum. Without crossover 
methods, on the other hand, the algorithm relies solely on the mutation methods and 
becomes purely probabilistic. 

As far as the nature of the mutation function is concerned, the structure 
of the data leads to significant differences in terms of exploration in the model described 
above. If the problem is represented in the form of a single sparse vector, mutation methods 
on their own are able to generate a proper exploration rate, because the vector 
attains a new configuration when its elements are randomly shuffled around. If the data structure 
is instead implemented as a dense vector, as described above for the new 
approach, exploitative mutation methods that shuffle elements, such as CIM and RSM as 
described in [10], offer little to no exploration. This is because the dense vector 
only contains a subset of the possible configurations, so shuffling does not expand the search 
space beyond what was initially generated with the individual. In this case, new configurations 
need to be introduced in order to obtain a desired exploration rate. This can be achieved 
with mutation methods that not merely shuffle elements around but can also alter them, an 
example of which is the Twors mutation [10]. Figure 3 illustrates the behavior of the dense 
vector approach when using only exploitative versus explorative mutation methods. 


8 Discussion 


The method described herein assumes that the strips are constrained to insertion and removal 
from the front only in a true Last-In-First-Out fashion. Methods to use the neighbouring strips 
to access items are not discussed, but could be further considered in the fitness methods, if 
applicable. This approach still considers all items as single entities, even if they are of the 
same type. A further refinement could be to consider clusters of items in order to improve 
convergence behavior and item location quality in otherwise sub-optimal cases. 

As mentioned, the stopping criterion is a particular problem for this type of algorithm. 
Advanced methods utilizing early stopping techniques such as those described in [12] were not 
explored or evaluated for their applicability in this context. 

The size of the population in relation to the number of items to be stored and the size 
of the storage has not been mathematically quantified. As a consequence, the population 
size is based on intuition or trial and error. 

Compared to the sparse vector approach, the dense vector approach yields a significant 
performance gain at the cost of higher complexity. As a result, the theoretical performance 
of the dense approach is harder to assess than that of the sparse 
vector approach. 

Further research could explore the addition of a memetic algorithm 
approach based on the work of [13], in which individuals are further refined. 

Adaptive assembly systems allow many degrees of flexibility but are inherently complex 
in their optimization. We hope to adapt the described algorithm for this usage scenario. 
Due to this inherent complexity, however, we first explore the approach 
using simpler but real-world application scenarios. 


for Automated Assistance Systems 
for the Classification of Tool Wear 
on Milling Tools 

Abstract 


Tool wear and the decision when to replace tools is a universal challenge in the metal 
cutting industry. While the tool wear state can be accurately determined using optical 
measuring methods, the tool wear of milling tools is often examined by the CNC 
machine operators, especially in small and medium enterprises. In order to increase 
the accuracy with which tool wear can be correctly classified, it is advisable to use 
an assistance system that automatically removes the tools from a buffer, examines the 
tool wear state based on visual sensor data and sorts them into separate boxes according 
to the classification result. In this context, the accurate classification of tool wear 
is a key capability that can be enabled using methods of machine learning, based on 
image data that was labeled by human experts. In this paper different machine learning 
models are examined based on their ability to classify images of milling tools into 
the categories worn and not worn. The EfficientNet_b0 model achieves an accuracy of 
91.47% and outperforms human experts who classified similar images by 22.87 percentage points. 


1 Milling Tool Assessment in the Machining Industry 


In the manufacturing industry the product quality needs to be optimized and production 
cost minimized in order to compete with other enterprises. While the usage of worn 
tools decreases product quality, the underuse of a tool's remaining lifespan results in an 
increase in production cost [1]. In order to maintain a sufficient product quality, only 
50%-80% of the mean tool life is generally used [2]. 

Thus, the need arises for effective assistance systems to determine tool wear in order 
to reduce production costs by as much as 10%-40% [3]. 

In medium and small enterprises, the decision whether or not a milling tool can still 
be used is often made by the machine operators. While they have specific tools such as 
magnifying glasses or microscopes at their disposal when handling the milling tools, the 
classification is still subjective and contains an underlying, individual bias, which can 
lead to different individuals classifying the same milling tool differently. 

In order to classify tool wear on a more accurate, deterministic basis an automated 
assistance system is necessary. This assistance system could, by means of an industrial 
robot, remove milling tools from a predefined buffer and then feed them to a camera, 
which takes several images of the milling tool. Using these images, the tool wear could 
be examined. Following the image-based classification, the milling tools could be sorted 
into separate, predefined output buffers based on the classification results. In the context 
of the described assistance system, the required image processing of the collected image 
data is a key component. Methods of machine learning can be used in order to classify 
the images, therefore enabling the usage of the aforementioned system. 


2 Classification of Tool Wear Using Methods of Machine 
Learning 


Tool wear can be classified using either an indirect or a direct approach. The indirect 
approach utilizes cutting parameters such as force, vibration, acoustic emission or the 
measured power of the CNC machine [4-7]. Since these parameters can be measured 
during the milling process, no intervention in the process is necessary to draw conclusions 
about tool wear [5, 6]. Using statistical methods, the indirect approach determines 
a correlation between tool wear and the recorded sensor signals as a basis to classify tool 
wear [7]. 

In the direct approach, the tool wear is measured by means of optical sensors via the 
geometric properties of the tool [4, 6, 7]. For the optical measuring of the tool wear it is 
generally necessary for the milling tool to be removed from the machine [6], which 
causes machine downtime [7]. Under ideal conditions, the direct measurement of the tool 
wear offers a higher recognition accuracy than the indirect approach [4, 6]. 
Uncertainties may arise from the interpretation of the image data by human operators 
[4]. The presence of chips or cutting fluids in the image data affects the recognition 
accuracy as well [4, 6]. 

Classical methods of computer vision such as the Sobel and Canny algorithms as well 
as the active contour method have been applied in the literature to detect tool wear [4]. 
Deep learning approaches outperform classical approaches with regard to the classification 
of images [8]. Additionally, the classification accuracy of machine learning methods is more 
robust towards changing light conditions [4]. 

Methods of machine learning can be used to classify tool wear based on both 
the indirect and the direct approach. Neural networks are the most used method for the 
indirect classification of tool wear [6]. Machine learning approaches such as neural networks 
are able to extract knowledge from large amounts of data and map this knowledge 
in a model, which is then able to apply the learned knowledge to the specific application. 
For the classification of tool wear, deep learning methods are particularly suitable, since 
they can detect patterns in the input data independently, which is why external feature 
detection is not necessary [9]. These methods require a large amount of data, which is 
not always accessible [6, 10]. This is particularly true for use cases where expert knowledge 
is required to label the data. For these specific cases, which include the classification 
of tool wear, deep learning approaches such as ensemble learning or transfer 
learning look promising [6]. 

Since deep learning methods are able to detect features in the datasets without the use 
of external feature detection algorithms, they can be used to find correlations in sensor 
data recorded during milling processes. This data can be processed through the 
use of deep learning methods such as convolutional neural networks (CNN) [10]. This is 
done by encoding the time series data as images, which can then be processed by a CNN 
[11, 12]. 

Table 1 shows an overview of the presented literature and their key parameters. The 
indirect approaches using sensor data in order to classify tool wear reach an accuracy of 
86% to 90%. These approaches use sensor data based on the entire lifespan of milling 
tools in order to classify the wear. The predicted classes range from no wear to steady 
state wear and finally tool failure. The direct approaches use image data to classify tool 
wear. The approach proposed by Wu et al. does not classify whether an image depicts 
a worn or a not worn tool but different kinds of wear phenomena [13]. Bergs et al. use 
image segmentation instead of image classification to detect tool wear. Therefore, they 
use the Intersection over Union (IoU) metric instead of accuracy to evaluate their results. 
Ambadekar et al. classify images of the surface quality of workpieces in order to classify the 
wear of the used cutting tool. Using this approach, they classify the wear state of the tool 
with an accuracy of 87.26% [9]. 

Many state of the art CNN architectures are trained on publicly available datasets, 
such as the ImageNet dataset, in order to evaluate their performance. CNNs trained on the 
ImageNet dataset are observed to be biased towards detecting textures instead of object 
shapes [14]. This property is beneficial for the detection of tool wear, since the detection 
of tool wear is a texture recognition problem [4]. This should enable network architectures 
that are good at classifying images on the ImageNet dataset to reliably classify tool 
wear. CNN architectures such as VGG [15] or ResNet50 [16] have successfully been 
used to classify tool wear [9, 13]. 


Table 1 Comparable approaches for the classification of tool wear 

Authors                Approach  Architecture  Size of training dataset  Accuracy 
Zhang et al. [6]       Indirect  CNN           164 milling processes     86% 
Guarri et al. [11]     Indirect  CNN           220 images                90% 
Martinez et al. [12]   Indirect  CNN           14,000 images             89% 
Bergs et al. [4]       Direct    U-Net         3000 images               IoU 0.73 
Ambadekar et al. [9]   Direct    CNN           1183 images               87.26% 
Wu et al. [13]         Direct    CNN           5880 images               96.20% 


3 Approach to Aligning the Classification Accuracy of a 
Machine Learning Algorithm With Expert Knowledge 


3.1 Image Acquisition Device 


The images necessary to train the neural network are taken using a Nikon D5600 camera 
with a Sigma 150 mm camera lens. The camera is mounted on top of a special fixture 
which prevents relative movement between the camera and the tool holding fixture for 
the milling tools. This ensures that the images are taken under identical initial conditions. 
To prevent image blur when the camera shutter button is pressed, a remote control 
is used. The tool holding fixture, in which the milling tools are inserted, can be rotated 
around its axis by 360 degrees. The edges of the tool holding fixture allow it to be manually 
turned in intervals of 45 degrees, so that the entire circumference of the milling 
tools can be photographed through eight individual images. In order to capture images of 
the front side of the milling tools, the tool holding fixture can be attached at a different 
angle. The entire image acquisition device is placed inside a photo box when taking 
the images in order to ensure constant illumination, as shown in Fig. 1. The photo box 
contains LEDs which illuminate it with diffuse light. Tool holding fixtures of different 
diameters can be used for different milling tools. 

Fig. 1 Image acquisition device within a photo box 


3.2 Dataset and Preprocessing 


The dataset acquired by using the aforementioned device consists of 328 images of 41 
different milling tools. These milling tools were classified by an expert into the categories 
worn or not worn, using magnifying glasses or microscopes. This classification is 
taken as the ground truth for the images in the dataset. Therefore, uncertainties in the 
dataset can be expected. The dataset is split into different subsets for training, validation 
and testing at a ratio of 60:20:20. In order to enable the machine learning method used to 
process the image data more efficiently, the images are preprocessed. 

Initially the images are cropped, so that the majority of the background, which contains 
no information on the tool wear, is removed, thereby reducing the size of the 
image. In order to enlarge the dataset, different filters are applied to the individual 
images, which increases the robustness of the model after training [4]. These 
filters include the increase of contrast, the increase of illumination, as well as the use 
of a sharpening and a softening filter. Image augmentation techniques such as translation 
and rotation are not used, since the position of the milling tools relative to the camera 
is fixed. Therefore, these augmentation techniques offer no benefit. By applying these 
filters to the images, the dataset is increased to 1640 images. Fig. 2 shows four images of 
the same milling tool with the different filters. 

Fig. 2 Images of the same milling tool using different filters. From left to right: contrast, illumination, 
sharpening, softening 
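
The dataset sizes quoted above can be reproduced with a short calculation. The sketch assumes eight circumferential images per tool and that each original image is kept alongside its four filtered copies. Both assumptions are inferred from the reported totals rather than stated explicitly.

```python
TOOLS = 41
VIEWS_PER_TOOL = 8  # assumed: one image every 45 degrees around the tool
FILTERS = ["contrast", "illumination", "sharpening", "softening"]

# Images taken with the acquisition device.
base_images = TOOLS * VIEWS_PER_TOOL

# Assumed: the original image is kept in addition to its filtered copies.
augmented_images = base_images * (1 + len(FILTERS))

print(base_images, augmented_images)
```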


3.3 Convolutional Neural Network Implementation 


State of the art CNN architectures such as VGG and ResNet50 are capable of classifying 
tool wear, since the recognition of tool wear is a texture recognition problem rather than 
an object detection problem, as described in Sect. 2. In order to classify 
the wear on milling tools based on the dataset, several training runs are conducted using 
the VGG [15], ResNet50 [16] and EfficientNet_b0 [17] architectures. The EfficientNet 
scores a better result than the VGG and ResNet50 when trained on the ImageNet dataset, 
while utilizing fewer parameters and training faster. Since a large number of parameters 
is one factor that contributes to overfitting, which was observed in previous papers when 
classifying tool wear using VGG and ResNet50, the EfficientNet is employed as well. 

Transfer learning is one possibility to reduce the effect of overfitting, especially for 
small datasets. In order to evaluate the classification results of the different architectures 
and the influence of transfer learning based on the ImageNet dataset, 
the VGG-16, VGG-19, ResNet50 and EfficientNet_b0 models are trained with and without 
pretrained weights based on the ImageNet dataset. The base models are 
extended by the following layers in order to fine-tune the model to detect tool 
wear. After the base model, global average pooling is used. Following the global average 
pooling, a fully connected layer (FC layer) with 128 neurons is added, which uses 
the ReLU activation function. The last layer is another fully connected 
layer with two neurons, representing the two possible classification results. This layer 
uses the SoftMax activation function. The architecture is depicted in Fig. 3. 
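
The size of the added layers can be checked with a short calculation. The feature widths of 2048 for ResNet50 and 1280 for EfficientNet_b0 after global average pooling are properties of the published architectures and are not stated in the text, and the function name head_parameters is chosen for illustration.

```python
def head_parameters(features, hidden=128, classes=2):
    """Weights and biases of the two fully connected layers after pooling."""
    dense_hidden = features * hidden + hidden   # FC layer with 128 neurons
    dense_output = hidden * classes + classes   # output layer with 2 neurons
    return dense_hidden + dense_output


# Assumed number of feature channels delivered by the global average pooling.
FEATURES = {"ResNet50": 2048, "EfficientNet_b0": 1280}

head_sizes = {name: head_parameters(width) for name, width in FEATURES.items()}
print(head_sizes)
```

The resulting values of 262,530 and 164,226 agree with the trainable parameters listed for runs 6 and 8 in Table 2, which suggests that in the transfer learning runs the pretrained base was kept fixed and only the added layers were trained.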

For the training of the model an NVIDIA RTX 2060 graphics card is used. The code 
was implemented in Python, using the TensorFlow 2.6 framework [18]. 

Fig. 3 Architecture of the CNN 


Table 2 Results of the training runs 

No.  Model            Transfer learning  Trainable parameters  Test accuracy  Test loss 
1    VGG-16           No                 14,779,580            0.5            0.632 
2    VGG-16           Yes                64,892                0.6765         1.3373 
3    VGG-19           No                 20,089,276            0.5            0.6933 
4    VGG-19           Yes                64,892                0.7206         1.3994 
5    ResNet50         No                 23,850,242            0.7676         0.9932 
6    ResNet50         Yes                262,530               0.8971         0.5377 
7    EfficientNet_b0  No                 4,171,774             0.54           0.6932 
8    EfficientNet_b0  Yes                164,226               0.9147         0.1791 


For training the models the Adam optimizer is used. The loss function is categorical 
cross entropy. The images used for the training are passed to the models in a resolution 
of 405 x 150 pixels in batches of eight images. The images are downscaled in order to 
reduce the number of parameters and increase training speed. The models are trained for 
50 epochs. Since the accuracy of the models does not increase after a certain number of 
epochs, no further training runs above 50 epochs are conducted. 


4 Results 


The results of the training runs are shown in Table 2. Transfer learning improves 
the accuracy of every model. The test loss decreases for the ResNet50 and EfficientNet_b0 
models, whereas it increases for both VGG models. The models perform in 
line with their performance on the ImageNet dataset. The VGG-16 model scores an 
accuracy of 50% without the use of transfer learning and 67.65% with the use of transfer 
learning. The VGG-19 model scores an accuracy of 50% without the use of transfer 
learning and 72.06% with the use of transfer learning. The second-best model is the 
ResNet50, which scores 76.76% without the use of transfer learning and 89.71% with 
the use of transfer learning. 

The best overall accuracy is achieved by the EfficientNet_b0 model with transfer 
learning, which scores an accuracy of 91.47%. This model also achieves a significantly 
lower loss on the test dataset. 

Fig. 4 Accuracy and loss of the EfficientNet_b0 model without transfer learning 

Fig. 5 Accuracy and loss of the EfficientNet_b0 model with transfer learning 


The course of the accuracy and error of the EfficientNet_b0 model using no transfer 
learning are shown in Fig. 4. The accuracy on the training dataset shows a linear increase 
in the first 20 epochs before rising significantly and converging to one. The loss on the 
training dataset shows a drop at the first epoch and remains almost constant for another 
twenty epochs. After twenty epochs the loss decreases in a volatile manner to values 
between 0.2 and zero. The accuracy on the validation dataset barely increases at all. The 
loss on the validation dataset is highly volatile and does not decrease below a value of 
0.7. 

Figure 5 shows the course of the accuracy and error of the EfficientNet_b0 model 
using transfer learning. Similarly to the model without transfer learning, the accuracy 
and error of the model on the training dataset converge to one and zero, respectively. 

The significant difference between the models is that the convergence is achieved at 
a substantially faster rate. The validation accuracy increases significantly in the first few 
epochs and converges around 0.9. The error on the validation dataset decreases significantly 
in the first few epochs, down to a value of 0.2. While the course of the validation 
error of the model with transfer learning is still volatile, it is significantly steadier than 
the validation error of the model that does not make use of transfer learning. 

The confusion matrix of model eight is shown in Fig. 6. The correct classifications 
on the test dataset are shown on the main diagonal of the matrix. 46.7% of the test samples 
show no tool wear and were classified correctly. A further 44.7% show tool wear and were 
classified correctly. The model misclassifies 5.29% of the samples although tool wear is 
present and 3.26% although no tool wear is present. This results in 91.4% accuracy, 
which is higher than most accuracies that can be found in the literature. 

The dataset that is used to train, validate and test the models consists of images of milling 
tools which were classified into the categories wear or no wear by a human expert, 
as described in Sect. 3.2. Therefore, the labels in the data set can be expected to contain 
an individual bias, resulting in some uncertainty in the classification of tool wear by the 
models. Thus, it is unlikely that 100% accuracy can be achieved. Taking this background 
into account, the classification accuracy achieved by EfficientNet_b0 is all the more 
remarkable. The results show that it is possible to reproduce human expert knowledge 
using CNN without having to perform metrological evaluations for the annotation of the 
data. In contrast to the existing literature, it was not investigated whether or which 
wear can be detected, but whether the tool would be classified as worn or not yet worn 
by a human expert. Compared to the classification of different wear features, this type of 
classification is particularly challenging, as the number and extent of the wear features in 
the image have a non-trivial influence on the wear condition of the depicted tool. Contrary 
to the approaches in the literature, all potential wear features have to be considered 
at the same time. 

In order to evaluate how well the model is able to match the knowledge of machine 
operators regarding the classification of milling tools, ten different milling tools were 
classified by 14 different machine operators using magnifying glasses to help them with 
the classification. These milling tools are a subset of the tools used for the creation of the 
data set and were classified by the same expert for their wear condition. The results are 
shown in the right matrix of Fig. 6. The average accuracy of the 14 humans when classifying 
these ten milling tools is 68.6%, which is significantly lower than the model accuracy 
on the test dataset. Thus, it can be concluded that the CNN is able to match human 
expertise very well. 

Fig. 6 Confusion matrix of the EfficientNet_b0 with transfer learning 
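
The figures quoted in this section can be cross-checked with a few lines of arithmetic. The dictionary layout and variable names are chosen for illustration. The values are the percentages stated in the text, read as shares of the whole test dataset.

```python
# Shares of the test dataset in percent, keyed by (actual, predicted) class.
confusion = {
    ("not worn", "not worn"): 46.7,
    ("worn", "worn"): 44.7,
    ("worn", "not worn"): 5.29,
    ("not worn", "worn"): 3.26,
}

# Accuracy is the sum of the main diagonal of the confusion matrix.
accuracy = sum(share for (actual, predicted), share in confusion.items()
               if actual == predicted)
misclassified = sum(share for (actual, predicted), share in confusion.items()
                    if actual != predicted)

model_accuracy = 91.47    # test accuracy of model eight in Table 2
operator_accuracy = 68.6  # mean accuracy of the 14 machine operators
gap_in_points = model_accuracy - operator_accuracy

print(f"accuracy from matrix: {accuracy:.1f}%")
print(f"misclassified: {misclassified:.2f}%")
print(f"gap to operators: {gap_in_points:.2f} percentage points")
```

The gap between the model and the machine operators is a difference of two accuracies and is therefore expressed in percentage points.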

Therefore, the usage of an assistance system which classifies the tool wear on milling 
tools based on images can be an effective tool for humans in making these decisions 
while handling the milling tools, thus reducing production costs and conserving 
resources. 


5 Conclusion and Outlook 


Dealing with tool wear is a challenge faced by every company in the machining industry. 
The decision whether or not a tool can still be used is often made by human machine 
operators, specifically in small and medium enterprises. Tool condition monitoring systems 
can help to provide objective decision making and therefore help to reduce 
costs. The usage of image processing via machine learning serves as an enabler towards 
the development of such an assistance system that stores, handles and classifies milling 
tools by the state of their tool wear. 

In recent years methods of deep learning have proven to be able to detect tool wear 
based on indirect and direct approaches. For the direct approach, which classifies the tool 
wear using images of the tools, state of the art CNN architectures that perform well on 
the ImageNet dataset, such as VGG and ResNet50, have proven to be able to detect tool 
wear. The VGG-16, VGG-19, ResNet50 and EfficientNet_b0 models were trained based 
on the created dataset, which consists of 1640 images based on 41 different milling tools 
that were classified as worn or not worn by experts. The usage of the weights based on 
the ImageNet dataset significantly boosted the accuracy of every model. The 
EfficientNet_b0 model with the use of the ImageNet weights performed best with an 
accuracy of 91.47%. The model outperforms human machine operators in classifying the 
wear on milling tools by 22.87 percentage points. 

An image-based assistance system that helps machine operators in classifying the 
wear on milling tools could decrease production costs, since a larger proportion of the 
possible tool life expectancy could be used. Furthermore, the usage of worn tools would 
become less likely by using such an assistance system, which leads to an increase in 
product quality. 

To achieve further improvements in detection performance, the dataset should first be 
enlarged. Furthermore, the dataset contains an inherent bias, since the data was labeled 
by a human expert. This bias could be removed by classifying the samples based on 
measured wear phenomena in their geometry. 

A comparable approach could be used to assess the wear of turning tools or in opti- 
cal quality assurance. The approach is particularly suitable in areas where there are no 
clearly defined boundaries between the classification results, which means that classic, 
analytical approaches cannot be used.


Abstract


Increasing product variety, shorter product life cycles, and the ongoing transition 
towards electro-mobility demand higher flexibility in automotive production.
Especially in final assembly, where most variant-dependent processes take place,
the currently predominant concept of flowing line assembly is already being pushed
to its flexibility limits. Line-less assembly systems break up the rigid line structures
by enabling higher routing and operational flexibility using individual product routes 
that are takt-time independent. Hybrid approaches consider the combination of line 
and matrix-structured systems to increase flexibility while maintaining existing struc- 
tures. Such system changes require a high planning effort and investment costs. For 
a risk-minimized potential evaluation, discrete-event simulation is a promising tool. 
However, the challenge is to model the existing line assembly concept and line-less 
assembly for comparison. In this work, a comprehensive scenario analysis based on 
real assembly system data is conducted to evaluate the potential of line-less assembly 
in the automotive industry. Within the simulation, an online scheduling algorithm for 
adaptive routing and sequencing is used. Based on an automated experiment design, 
several system parameters are varied full-factorially and applied to different system 
configurations. Various scenarios considering worker capabilities, station failures,
material availability, and product variants are simulated in a discrete-event simulation
under realistic assumptions. Results show that throughput and utilization can be
increased in the hybrid and line-less systems when station failures are taken into
account and the order input remains unchanged.
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Flexibilisation 


1 Introduction 


Production systems are evolving towards smart, cognitive, and more adaptable systems 
to cope with new global challenges [1]. Especially in the automotive sector, assembly 
systems face several drivers for flexibilisation: The shift of strategy from centralized 
production to decentralized plants to cope with local demands leads to a reduction in 
production volumes. Therefore, frequent adjustments of production lines are necessary, 
implying high investment costs per unit for inflexible systems [2]. Due to shorter product 
life cycles, production systems must adapt to new or expanded product portfolios [3]. 
Specifically, as the demand for electric cars is rising but not high enough to operate indi- 
vidual production lines economically, multi-model lines are used. Moreover, the higher 
frequency of production start-ups demands a more efficient ramp-up enabled by flexible 
systems [4]. Additionally, unexpected global pandemic events and political restrictions 
lead to material shortages resulting in long downtimes when using inflexible assembly 
systems [5, 6]. In summary, existing conventional automotive production systems do not 
meet these new requirements [7]. 

Therefore, in the past, various systems have been developed for flexibilization in the 
automotive industry [8]. Enablers for flexible systems are flexible transport systems 
(e.g., AGVs), highly qualified workers, and modular, reconfigurable stations [1]. Exist- 
ing flexibility levels (e.g., line-less assembly, agile hybrid assembly systems) differ in 
the number of relaxed restrictions and applied flexibility [9]. The decision for one of 
these flexibility levels demands assessing potential investments, benefits and risks. In 
this context, one must take both the green-field planning to evaluate production systems 
to be newly designed and the estimation of a conversion potential for existing systems 
into account. For such modeling and planning, especially in the automotive industry, the 
method of discrete event-driven simulation (DES) is an established tool [10]. So far, no 
method or tool exists to derive the potential of different flexibility levels and systems 
in an application-specific and automated way. In order to conduct comprehensive
analyses, standardised models are required for the low-effort investigation of different
models on the basis of common system variables. Therefore, this paper presents a
methodology for an automated simulation-based potential analysis of different flexibility
levels in the automotive sector. Subsequently, an application of this methodology to an
automotive industry use case and the derivation of relevant potentials are presented.


2 State of the Art: Simulation Studies of Assembly 
Flexibilisation 


To model an automotive final assembly, special requirements have to be taken into 
account. For example, there are a large number of stations in which a combination of 
manual and automated processes are performed. The predominant organisation form of 
takt line assembly is characterised by a continuous product flow and a strongly planned 
takt. Product- and system-side flexibilisation has been investigated through simulation
studies in several publications, each with a different focus.

In particular, the impact of routing flexibility on the performance of manufacturing 
systems is often simulated with discrete-event simulation to evaluate the effects of flexi- 
bilization [11-15]. Routing flexibility can be defined as “the ability of a manufacturing 
system to produce a part by alternate routes through the system” and can be achieved by 
having multi-purpose stations and allowing individual product routes. On the other hand, 
operation flexibility describes “the ability of a part to be produced in a different way” 
and thus refers to product-side flexibility [16, 17]. 

While most publications do not specifically refer to line-less assembly systems in 
comparison to line assembly systems, Hofmann et al. [18] examine the effect of routing
and operation flexibility in so-called matrix productions and production lines. The
resulting flexibility levels are simulated for different levels of failure probabilities and
evaluated in terms of throughput time, tardiness, output and utilization. The results show
that especially long downtimes are a motivator for matrix production, as adherence to
schedules is significantly better in this case. However, the evaluated system includes
only 10 work stations and is not based on industrial data.

Schönemann et al. [19] also compare matrix with line production in discrete-event 
simulation and investigate the influence of buffer sizes and machine failures. The results 
show that the matrix system can achieve a higher utilisation due to the redundancy of 
work stations and the used adaptive control strategy. Only the given scenario with eight 
work packages is investigated for the two designs and no intermediate levels of flexibil- 
ity are considered. 

Göppert et al. [20] use an automated scenario analysis to investigate in a high number 
of scenarios the influence of operation and routing flexibility on the flow time in line-less 
assembly systems compared to line assembly systems for different station failures and 
interarrival times. Results show that especially operation flexibility can compensate sta- 
tion failures and bottlenecks due to low interarrival times. The evaluated system consists 
of 8 work stations and is not based on industrial data.

Küpper et al. [21] define the concept of flexible-cell manufacturing, in which workers
are assigned to individual matrix-structured work stations, which are divided into spe- 
cialised and generalised cells. In a simulation study, the final assembly for a real automo- 
tive use case is modelled for the existing line concept and as a flexible-cell concept. The 
focus of the evaluation is on worker utilisation, which is increased by 12% in a shift to 
flexible-cell manufacturing. 

In conclusion, the number of stations and processes to be executed is not sufficiently
taken into account in the publications presented, and only Schönemann et al. [19],
Hofmann et al. [18], Göppert et al. [20] and Küpper et al. [21] compare a flexible
production system with takt line production. Only Küpper et al. [21] refer specifically to
the requirements of the automotive industry and use real data to map the complexity.
Here, however, only the contrast between line production and flexible-cell production
is modelled, without considering hybrid forms. Therefore, an investigation of hybrid
line-less assembly systems compared to classical line assembly and line-less assembly,
based on real industry data for automotive final assembly, is required.


3 Automated Scenario Analysis for Modelling of Takt Line 
and Line-Less Assembly Systems 


To assess the potential of flexibilization in automotive assembly, a simulation-based
automated scenario analysis is applied: Based on standardized input data (see Fig. 1), a
full-factorial experimental plan of simulations to be performed is generated.

Fig. 1 Automated scenario analysis to generate full-factorial simulation experiments to be
run on different flexibility levels

The input data is based on industry data, e.g., MTM times, building plans and error
protocols. It contains information about the layout, the stations, operations, products,
transport systems, and general simulation information. A normal distribution for the mean
time between failures (MTBF) and a uniform distribution for the mean time to repair (MTTR)
are assigned to each station for breakdown simulation. Repeating each simulation with
different random seeds averages out stochastic influences and reduces the impact of
outliers. The simulations to be executed differ in the combinations of the input
parameters and the applied assembly system flexibility levels. Three flexibility levels
are considered:


1. Line assembly: Assembly stations are linked to each other with a continuous product 
flow. Therefore, no pure transport time occurs, as work is performed during transport. 
The process steps to be performed in parallel at one station are assigned to a station 
in advance. It is not possible to swap process sequences as the order is fixed. There is 
no re-sequencing by overtaking or diverting individual vehicles. The product always 
remains in a station for the given takt time, and the process times are no longer than the
length of a takt time. The line is divided into sections with buffer spaces in between.

2. Hybrid assembly: This concept is a combination of line and line-less assembly. Sta- 
tions are grouped into so-called clusters. The division can, but does not have to, be 
based on line assembly sections. Within a cluster, all stations can perform all pro- 
cess steps of the cluster. Therefore all human workers at a station are qualified to 
perform the necessary process steps. By using flexible transport systems (e.g. AGV), 
station and process sequences can be adapted. However, the order of the clusters to 
each other is inflexible, so restrictions by the assembly priority graph are considered. 
Process times at a station result from the station-specific longest process time to be
performed manually or the automatic process time. There is no general takt time. In that
way, the clusters are arranged in a line concept but within one cluster, the stations and 
procedures are based on line-less assembly. 

3. Line-less assembly: The assembly sequence’s time and place restrictions are repealed. 
Each process step can be carried out at each station, and process sequences can be 
selected individually for each product. The only exceptions are special stations (e.g. 
automatic stations or stations with a special tool) that perform processes that other 
stations cannot take over. However, these stations can also be used flexibly in the pro- 
cess sequence. Process times at a station result from the station-specific longest pro- 
cess time to be carried out manually or from the automatic process time. There is no 
general takt time. 


The discrete-event simulation models are generated and executed in the software Tecno- 
matix Plant Simulation [22]. A greedy algorithm controls the online routing and schedul- 
ing decisions during the simulation. To minimize the assembly makespan, the decision 
on the next process step and execution location is made on a product-individual basis, 
considering the current queue lengths, transport times, and station failures. The result of 
the automated scenario analysis is a detailed statement of relevant KPIs (e.g. makespan,
output quantity, utilization, waiting times) for each scenario. The comparison of all
scenarios allows for conclusions about the potential of different flexibility levels in
automotive assembly.
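
The greedy decision rule described above can be illustrated with a minimal Python sketch.
It is not the Plant Simulation implementation used in the study: the Station fields, the
cost estimate (transport time plus queued work plus process time) and all numbers are
assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Station:
    name: str
    capabilities: set      # process steps the station can perform
    queued_work: float     # minutes of work already waiting (assumed queue measure)
    failed: bool = False   # True while the station is down


def greedy_next_step(open_steps, stations, transport_time, process_time):
    """Pick the (step, station) pair with the smallest estimated completion time.

    open_steps     -- steps whose predecessors are finished for this product
    transport_time -- dict: station name -> minutes from the product's position
    process_time   -- dict: step -> minutes
    Returns None if no available station can perform any open step.
    """
    best = None
    for step in open_steps:
        for st in stations:
            # Skip stations that are down or lack the required capability.
            if st.failed or step not in st.capabilities:
                continue
            cost = transport_time[st.name] + st.queued_work + process_time[step]
            if best is None or cost < best[0]:
                best = (cost, step, st.name)
    return None if best is None else (best[1], best[2])


# Hypothetical example data.
stations = [
    Station("S1", {"mount_seat", "fit_trim"}, queued_work=6.0),
    Station("S2", {"mount_seat"}, queued_work=1.0),
    Station("S3", {"fit_trim"}, queued_work=0.0, failed=True),
]
choice = greedy_next_step(
    open_steps=["mount_seat", "fit_trim"],
    stations=stations,
    transport_time={"S1": 0.5, "S2": 2.0, "S3": 1.0},
    process_time={"mount_seat": 3.0, "fit_trim": 2.0},
)
print(choice)
```

In this example the failed station S3 is ignored, and the product is sent to S2 because
its short queue outweighs the longer transport. Such a rule only looks one step ahead,
which is why it cannot guarantee a minimal makespan.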


4 Modelling of an Industrial Use Case 


The methodology is applied to an industrial use case with data taken from an automotive
OEM's final assembly. Six product types are assembled in the system: three car types,
each of which is built in two variants. Each vehicle type is therefore modelled once with
minimum and once with full equipment, whereby all assembly-relevant features are taken
into account. The order quantities of the product types are assumed to be equally
distributed and the sequence is randomly generated.

In all flexibility levels, a new order is released into the system during the given takt 
time if one required station is free. This ensures comparability, as the product input and
its quantity are the same in all systems. Each simulation is run with five random seeds
to compensate for the statistical influences. The system is first loaded for eight hours 
and then the statistics are gathered in order to exclude ramp-up effects. After the ramp- 
up, another eight hours of production time are simulated. The transportation between 
the stations is done by AGVs for the hybrid and line-less system and is simulated with 
AGV routing based on distance and velocity but without traffic. In the takt line system 
no transportation time is considered due to the continuous product flow. 

The considered part of the final line assembly consists of about 100 stations which 
are modelled while the number of stations remains the same for all system configura- 
tions. The process times are recorded in detail based on MTM data and assigned to the 
stations. The processes at a station sometimes take longer than the specified takt time, 
since in the real system compensation is achieved by principles such as workers who 
jump between stations or overlapping processes into the next station. Most processes are 
manual and can therefore theoretically be executed at any place. However, some pro- 
cesses are automated using robotics or need special tooling equipment that is only avail- 
able at one specific station. This results in the stations being divided into generalized and 
special stations that are considered in the hybrid and line-less approach. Each station is
assigned the same availability (between 98% and 100%, to cover a broad but realistic
range in the industry) and therefore fails randomly, with an MTTR of 5-30 min that is
assumed to be uniformly distributed. The type of failure is not further specified, so it can
be of a technical or organisational character and may also mean an absence of material.
The modelled flexibility levels are displayed in Fig. 2.
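
The link between a station's availability and its failure parameters can be made concrete
with the standard steady-state relation A = MTBF / (MTBF + MTTR). The following Python
sketch is only an assumption about how a target availability could be translated into an
MTBF; the text does not state how the MTBF values of the study were derived, and the
function names and the 10% relative spread of the normal distribution are hypothetical.

```python
import random


def mtbf_for_availability(availability, mean_mttr_min):
    """Solve A = MTBF / (MTBF + MTTR) for MTBF (in minutes)."""
    if not 0.0 < availability < 1.0:
        raise ValueError("availability must be strictly between 0 and 1")
    return availability * mean_mttr_min / (1.0 - availability)


def sample_failure_cycle(rng, mean_mtbf_min, mttr_range=(5.0, 30.0), rel_sd=0.1):
    """Draw one (uptime, downtime) pair in minutes.

    Uptime is normally distributed around the mean MTBF (rel_sd is an assumed
    spread); downtime is uniform over the MTTR range given in the text.
    """
    uptime = max(0.0, rng.gauss(mean_mtbf_min, rel_sd * mean_mtbf_min))
    downtime = rng.uniform(*mttr_range)
    return uptime, downtime


mean_mttr = (5.0 + 30.0) / 2.0                    # mean of a uniform 5-30 min MTTR
mtbf_98 = mtbf_for_availability(0.98, mean_mttr)  # about 857.5 min

rng = random.Random(42)                           # fixed seed for repeatability
cycles = [sample_failure_cycle(rng, mtbf_98) for _ in range(10_000)]
up = sum(u for u, _ in cycles)
down = sum(d for _, d in cycles)
print(f"MTBF at 98%: {mtbf_98:.1f} min, simulated availability: {up / (up + down):.4f}")
```

Under these assumptions, a station with 98% availability fails roughly once every
14 hours of operation. At 100% availability no failures occur, so the function
deliberately rejects that boundary value.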

Since there is a continuous product flow in the real takt line assembly, each vehicle 
must remain in the station for the length of the takt time, even when the processing time 
is over. In case of station failure, the products have to wait in the stations behind the 
affected station, but upstream processes continue to run since no rigid conveyor belt but 
AGVs are used in the current system as flexible means of transport. The line is divided
into sections, between which four buffer positions are available for decoupling. The
processes are uniquely assigned to stations and follow the predefined precedence graph.

Fig. 2 Overview of the modelled flexibility levels

In the first version of the hybrid assembly system, clusters are formed based on the
existing sections in the real line assembly (Cluster 1). A cluster always consists of the
stations that are in a section; within it, the capabilities are combined and the process
sequence restriction is dissolved, but the general sequence of the sections remains. In
the second version, two sections are always combined, making the matrix areas larger,
which also requires broader employee skills (Cluster 2).

In the line-less assembly system, all stations of the real system are transferred to a 
matrix layout and the restrictions are resolved. In one version, the process sequence is
fixed as in the real system; in the second version, the operational flexibility is set to
100% by dissolving all process restrictions, which cannot be transferred to the real
system but allows potentials to be identified. Manual processes can be executed at all
generalised stations, while the identified special stations remain (caused by automated
processes or special handling tools). The number of stations remains the same, and the
system is not optimised (e.g. by resolving resulting bottlenecks).

Since throughput is one of the most relevant KPIs in the automotive industry, the 
effect of station availabilities on this is examined in the following for all system variants. 


5 Discussion on Simulation Results 


For the scenario analysis, the station availability was examined in five levels between 
98-100% for five flexibility levels and the throughput was determined. For this purpose, 
the average of five simulation runs was calculated, resulting in a total of 125 simulations. 
The results for the throughput can be seen in Fig. 3. In an idealised system with 100% 
availability, the Takt Line Assembly performs best, as there is a continuous flow without
interruptions. The utilization is the highest for takt line assembly with 90.9% due to
process times that are shorter than the takt time (see Fig. 4). In theory, the more flexible
systems should be able to reach the results of the line at 100% availability, as they could
reproduce the process and station sequences of the line. The fact that this is not the case
shows the complexity in the planning and control of the systems and can be attributed to
the use of a simple greedy algorithm to generate the individual product routes.

Fig. 4 Results on the average station utilization based on five levels of station
availabilities for all flexibility levels
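
The size of the experiment plan follows directly from the full-factorial design. As a
rough Python sketch: the five flexibility levels are those named in the text, whereas the
even spacing of the availability levels, the seed values and the identifier names are
assumptions, and the KPI values are placeholders rather than simulation results.

```python
from itertools import product
from statistics import mean

availabilities = [0.98, 0.985, 0.99, 0.995, 1.0]   # assumed even spacing in 98-100%
flexibility_levels = [
    "takt_line",
    "hybrid_cluster_1",
    "hybrid_cluster_2",
    "line_less_fixed_sequence",
    "line_less_free_sequence",
]
seeds = [1, 2, 3, 4, 5]                            # five arbitrary random seeds

# One simulation run per combination of the three factors.
plan = list(product(availabilities, flexibility_levels, seeds))
print(len(plan))  # 5 * 5 * 5 = 125


def average_over_seeds(results):
    """Collapse per-run KPI values to one mean per (availability, level) scenario.

    results -- dict mapping (availability, level, seed) -> KPI value
    """
    grouped = {}
    for (avail, level, _seed), value in results.items():
        grouped.setdefault((avail, level), []).append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Placeholder KPI values only, to show the shape of the aggregation.
dummy_results = {run: float(run[2]) for run in plan}
scenario_means = average_over_seeds(dummy_results)
print(len(scenario_means))  # 25 scenarios
```

The 125 runs thus collapse to 25 scenario averages, one per combination of availability
and flexibility level.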

Already at 99.5% station availability, throughput drops significantly in Takt Line
Assembly, as a single station failure disrupts the entire product flow, while only a
slight decrease can be observed in the more flexible alternatives. At 100% availability,
throughput is 10% lower than in Takt Line Assembly for Hybrid Assembly Cluster 1 and
5% lower for Line-less Assembly with free operation sequence. At 98% availability,
however, throughput is 38% higher in a Hybrid system and as much as 51% higher in the
Line-less system. Increasing the size of the clusters (from Cluster 1 to Cluster 2) gives
an advantage (3% increase in throughput at 98% availability), although this requires
broader worker skills. Both cluster variants and line-less assembly are more resilient to
station failures. With regard to the precedence graph, relaxing it within clusters, and
especially allowing a free operation sequence, yields higher throughput and utilisation.
In comparison to the Takt Line Assembly, the Line-less Assembly System with fixed
operation sequence is significantly more resilient to station failures due to the
system-side flexibility of allowing alternative stations for manual processes. Figure 4
shows the average utilisation across all stations and the five simulation runs. Taking
station failures into account, the utilisation can be significantly increased by
flexibilisation (e.g. an increase of 17.8% from takt line assembly to Hybrid Assembly
Cluster 1 for an average station availability of 99%).

As a critical reflection, it is noted that completely free operation sequences in line-less 
assembly systems are unrealistic in reality, but the results show that an investigation of 
the precedence graph in terms of flexibility is worthwhile, as increased operational flex- 
ibility offers great advantages. The assumption for hybrid systems, on the other hand, is 
legitimate, but still needs to be validated by practical tests that could be done by
flexibilising the process sequences and worker tasks for selected line sections without
changing the general layout. In addition, the floor space requirement is not directly
taken into account, although the number of stations remains the same; line-less systems,
however, require larger path areas due to increased transport effort. Furthermore, workers
were considered only indirectly; a dedicated worker scheduling system is needed
for control. Compared with the state of the art, it can be confirmed here on the basis
of real data, that system- and product-side flexibilisation is worthwhile, especially when 
station breakdowns are taken into account. However, it also shows the complexity in the 
control system with a high number of stations, as the throughput and utilisation of the 
line could not be achieved with full availability. 


6 Conclusion and Outlook 


In conclusion, hybrid and line-less assembly systems show high potential for automotive
final assembly, as shown by extensive simulation studies. Whereas Takt Line Assembly
Systems achieve the best values in an idealised system, throughput and capacity
utilisation can be increased in Hybrid and Line-less Assembly Systems when station
failures are taken into account while maintaining the same order volume. While completely
line-less systems still require a high degree of planning and control effort (e.g. employee
qualifications), hybrid systems can be implemented more easily.

As a next step, in the hybrid and line-less systems, bottlenecks can be identified and 
the system be planned better (e.g. identify which automated stations are worth duplicat- 
ing; lower the inter-arrival time of jobs; implement order release strategies based on the 
system status to maximise utilisation). In addition, the takt line assembly can be mod- 
elled even more realistically by taking into account balancing principles, where pro- 
cesses take place across stations. The potential exists here to catch up in a line operated 
by AGVs through increased speeds after disruptions. Also, breaking up the rigid line 
structure into a Hybrid System shows benefits for this use-case. A further object of
investigation is the integration of new variants into the system. This raises the question
of how complex line-less systems deal with constantly new product integrations and
integrated ramp-ups and how these can be supported by simulation modules.


Line-Less Mobile Assembly Systems


Abstract


Volatile markets and production call for assembly systems that are adaptable to changes
in product types, production capacity, and product order. Computer-aided decision
support systems facilitate scheduling, planning, and controlling adaptive and flexible
assembly systems. Formal description models of resources and their capabilities, and of
assembly tasks and their requirements, are necessary for automated decision-making.
This paper contributes a conceptual CAPabILity-based resource AllocatioN Ontology
(CAPILANO). The ontology is tailored as a uniform description of heterogeneous
assembly resources and their (combined) capabilities, connected to a capability-based 
task allocation approach. The intended application of the resulting framework is the 
identification of suitable assembly resources in Line-less Mobile Assembly Sys- 
tems (LMAS) and their allocation to assembly tasks, based on a unified and formal 
description. To date, ontologies in assembly have been limited to querying resources 
and their capabilities; here, subsequent task allocation is presented as an integral 
component of a tailored framework. The resulting framework consists of a model of 
heterogeneous resources and their capabilities in an ontology created in Protégé in 
OWL, SPARQL-based querying, and a consecutive and availability-aware task allocation
in Python. The development of the ontology-based task allocation framework, including
ontology taxonomy, querying and task allocation, is discussed. Its applicability in LMAS
is demonstrated through linear scalability of task allocation, and future advances are
discussed.


Keywords


Ontology - Task Allocation - Line-less Mobile Assembly Systems 


1 Station Control in Line-Less Mobile Assembly Systems 


The trend of consumers demanding individualized products requires adaptable and flex- 
ible production and therefore adaptable and flexible assembly systems [1]. The para- 
digm of Line-less Mobile Assembly Systems (LMAS) offers a solution for realizing such 
an assembly system based on the three principles: Mobilized resources, a clean floor 
approach, and dynamic job-routes [2]. To ensure the adaptation of stations in LMAS to
new tasks, each new task involves checking which resources provide the capability to
perform the allocated task, a check that has so far been done manually. Therefore, the
objective of this paper is to provide a framework of the two necessary steps for forma- 
tion planning in a flexible assembly station: a formalized representation of assembly 
resources and their capabilities and a task allocation based on this representation [3, 
4]. Accordingly, the paper aims to answer the research question: How can assembly 
tasks, assembly resources and boundary conditions for automated task allocation be 
described? The resulting framework is intended as a foundation for adaptive assembly 
station planning through feasibility checking, whereby the offered capabilities of the 
resources and the requested requirements of the tasks are matched. 

The remainder of the paper is structured as follows: In Sect. 2, the related work on assem- 
bly resource modeling, capability modeling and task allocation is reviewed with regard to 
applicability for formation planning in LMAS. In Sect. 3, the methodology followed while 
developing the framework is summarized. Section 4 introduces the methodology’s results 
and details the derived conceptual schema of the ontology and the task allocation. A use- 
case-specific implementation on operation level is presented and its performance is evaluated 
in Sect. 5. Finally, the results are concluded and future work is presented in Sect. 6. 


2 Related Work in the Context of Station Planning in LMAS 


LMAS enables dynamic adaptation to changing demands through temporary and
reconfigurable layouts (formations) of assembly stations. The task-dependent and
ever-changing formation of the heterogeneous and mobilized resources in assembly stations
requires adaptable formation planning and task allocation [2]. The foundation for automated
task allocation to assembly resources is a consistent and formal modeling of the resources,
their capabilities and their taxonomy as a digital representation [4, 5]. Ontologies
provide a means of formally representing knowledge by describing instances and their
relations, and are thus suitable for modeling resources and their capabilities in the
manufacturing domain [6].

In the following, the requirements imposed on a capability-based resource allocation
ontology for a station in LMAS are detailed. The ontology has to be scalable, so that it
can be enhanced with new classes and instances representing newly integrated resources.
It has to enable queries for assembly resources and their capabilities in order to allocate
resources to assembly tasks depending on availability and requested capabilities [4].
Considering cooperating resources or changing capabilities depending on equipment and
tools, it has to provide inheritance of combinational capabilities. For station control,
the time-relevant update of properties (e.g. a resource being idle or not idle) must be
included [2]. To allow for task allocation, the transfer of query results to third-party
software has to be implemented. In the following, existing ontologies and frameworks are
evaluated with regard to these requirements.

The Product Resource Order Staff Architecture (PROSA) provides one of the first 
semantic representations intended for smart manufacturing. However, it leaves matching 
capabilities to requirements for future research [7]. MANDATE defines an International 
Standard for representing manufacturing management data, including the product, pro- 
cess and resource paradigm [6]. The “Referenzarchitekturmodell Industrie 4.0 (RAMI 
4.0)” for information systems defines a reference architecture of technical assets and 
their relevant aspects throughout their entire life cycle [3]. 

With MAnufacturing’s Semantics Ontology (MASON), a semantic net as an upper- 
level ontology for manufacturing was presented, including entities, operations and 
resources, but lacking a representation of capabilities [8]. In the BaSys 4.0 ontology,
modular resources provide combined basic and slave capabilities orchestrated by master
capabilities. BaSys 4.0 includes querying by matching requested and provided
capabilities, but excludes task allocation to individual resources [9]. Weser et al. aim
to create an upper-level ontology (C4I) enabling matching of the provided capabilities of
resources and the required capabilities to fulfill the task. They define capabilities as a
hardware-agnostic representation of the resources’ functionalities, consisting of
sub-capabilities. C4I lacks the means of allocating tasks to individual resources [5]. In
the Manufacturing Resource Capability Ontology (MaRCO), the four classes of product,
process, capability and resource are differentiated. Combined capabilities are modeled in
an approach analogous to C4I: the combined capabilities result from cooperating resources
or a combination of resources to an aggregated resource [10]. A wide-ranging review of
ontologies intended for robotic utilization can be found in [11].

Currently, ontologies can be queried for matching the task’s requested capabilities to 
resources’ provided capabilities, resulting in a list of individual or combined resources 
that provide the requested capability. For formation planning, it is necessary to allocate 
one specific resource to one specific task. No ontology fulfilling all stated requirements 
is publicly accessible to derive an ontology for task allocation for stations in LMAS.
Currently, the planning of formations of mobile resources in LMAS takes place manually. 
Due to the manual process, reconfigurable stations in LMAS are still inefficient for
industrial applications, especially for prototype production and lot size one.


3 Methodology and Foundations 


This research aims at developing an ontology-based task allocation framework as a foun- 
dation for station planning in LMAS. Varying methods of creating such a framework can 
be found in the literature. For building the domain ontology, the broadly accepted seven- 
step procedure according to Noy et al. [12] was followed due to its application-oriented 
structure. Moreover, the first two steps (definition phase: developing an ontology and 
modeling phase: modeling of a use-case) of the digital twin pipeline of GÖPPERT et al.
were applied [13]. Finally, validation and verification are carried out following SARGENT
et al. and GÓMEZ-PÉREZ through an application ontology and performance testing [14, 15].

The conceptual ontology is built in Protégé using the Web Ontology Language (OWL).
OWL is a machine-readable knowledge representation language that enables reasoning
systems to derive implicit knowledge from explicitly defined knowledge [9].
OWL allows for reasoning based on semantic and syntactic rules. It is thus formal and
allows capabilities to be inherited from one instance to another and to be composed
of other capabilities [12].


4 Capability-Based Resource Allocation Ontology CAPILANO 


In the following, the first phase of ontology-based modeling, the definition phase according
to GÖPPERT et al. [13], is described and the conceptual CAPabILity-based resource
AllocatioN Ontology (CAPILANO) is presented. The broadly applied concept of dividing
assembly systems into product, process and resource according to Martin et al. was followed.
We focus on the resources and their allocation to tasks through capability-matching, 
defining a process as a set of tasks [11, 16]. 

The resource class consists of the heterogeneous individual resources (class objects) 
and the associated capabilities of the individuals. Table 1 provides an example of the 
individual FASIMA_ABB4600 of the resource class “ABB_4600” and its parameters. 
According to RAMI 4.0, resources are assumed to be an entity, i.e., a uniquely identifi- 
able, represented, and known asset [3]. Consistent with the definitions of RAMI 4.0 [3] 
and PROSA [7], the resources follow the definition of an asset or holon: they have a
defined boundary and can be delimited individually, can be composed of other identifiable
resources, can be combined to form new resources, and can be assigned a value and a
purpose. [3, 7] The utilization of individual
parameters was adapted from MASON [8], PROSA [7] and PMK [17]. The actual
static and dynamic parameters were adjusted from MASON [8]. The interaction of these 
parameters with the capabilities was adapted from PROSA [7] and PMK [17]. 
Table 1 Example of the robot resource class ‘ABB_4600’ of CAPILANO

Individual: FASIMA_ABB4600

Parameter | Datatype | Unit | minValue / maxValue | Capability_Required | Capability_Inherited
Robot_Payload | XSD:Double | grams | 9999999999 | Position | EndEffector
Robot_Deceleration | XSD:Decimal | m/s^2 | 999,999 | Position | N/A
EndEffector | N/A | N/A | Spec


Following KLUGE, we assume capabilities to be describable by referring to the ele- 
mental assembly operations defined in standards and guidelines such as VDI 2860 and 
DIN 8580, 8582, 8588, 8592 and 8593 [18]. Here we define capability as a hardware- 
agnostic means of fulfilling a function. Capabilities are defined as classes and are 
assigned to the resources through the class restrictions adapted from MaRCO and C4I [5,
11]. To model the resources’ capabilities, the functional methodology from MaRCO [11] 
and specifications of the VDI 2860 are adapted, inheriting the concept of combinatory 
capabilities. In contrast to MaRCO, parameters like ‘Payload’ are assigned to the capability
class instead of the resource class, allowing for easier adaptation by changing
the individual itself instead of an entire class. Simple capabilities are directly assigned
to the individual resources and are therefore represented as individual parameters. Complex
capabilities consist of multiple simple capabilities, as visualized in Fig. 1. Depending
on the related capabilities, a resource inherits these parameters [7]. For example, if the
individual ‘Gripper’ has the individual parameter ‘GrippingForce’, the ‘Gripping’ capability
inherits this parameter. The combined capability combines the capabilities and
the related individual parameters, resulting in the capability parameter. For example, the
resource ‘mobile manipulator’ consists of the resources ‘robot’ and ‘AGV’ and inherits the
following capabilities: ‘EndEffector’, ‘Positioning’ and ‘Transporting’, as well as, assuming
the end effector ‘ScrewDriver’ is mounted, ‘Screwing’. The inheritance process itself is adapted from the
MASON ontology [8]. 
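
The combination of capabilities described above can be illustrated with a minimal Python sketch. The class `Resource` and the function `combine` are hypothetical and only mirror the mobile manipulator example; they are not taken from the CAPILANO implementation, which models this inheritance in OWL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for an individual resource in the ontology."""
    name: str
    capabilities: frozenset = frozenset()
    parts: tuple = ()


def combine(name, *parts):
    """Build an aggregated resource that inherits the capabilities of its parts."""
    inherited = frozenset().union(*(p.capabilities for p in parts))
    return Resource(name=name, capabilities=inherited, parts=parts)


# Capability names follow the example in the text; the assignment to
# individual resources is an assumption made for illustration.
robot = Resource("robot", frozenset({"EndEffector", "Positioning"}))
agv = Resource("AGV", frozenset({"Transporting"}))
screwdriver = Resource("ScrewDriver", frozenset({"Screwing"}))

mobile_manipulator = combine("mobile manipulator", robot, agv)
with_tool = combine("mobile manipulator + ScrewDriver", mobile_manipulator, screwdriver)

print(sorted(mobile_manipulator.capabilities))
print(sorted(with_tool.capabilities))
```

In this sketch the aggregated resource only gains ‘Screwing’ once the screwdriver is part of the combination, which corresponds to the equipment-dependent capability described in the text.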

Figure 2 presents CAPILANO as the main result of the definition phase, depicted 
with the related ontologies. Compliant with the development phase defined in GÖPPERT 
et al., elements of existing ontologies were inherited, adapted and extended. 


Fig. 1 Combined capabilities 
5 Task Allocation Framework


The conceptual framework for task allocation is depicted in Fig. 3. The framework con- 
sists of a means of storing and retrieving data (data lake), converting this data into a tai- 
lored schema (data conversion), querying the data in the ontology (here: CAPILANO) 
followed by allocation of the querying results, and converting the results back into a stor- 
able data format and thus closing the circle to the data lake. Realizing the framework’s 
goal to match the task’s requirements to the resources’ capabilities, the process of query- 
ing and consecutive task allocation is detailed below. The resulting framework can be 
found at: https://github.com/AKlugeWilkes/IoP-CAPILANO.

In a pre-processing stage, CAPILANO, including the resources and provided capa- 
bilities, is extracted. Moreover, the input process chart is retrieved from the data lake 
and converted into a custom schema adapted from MaRCO [11] and the C4I metamodel 
[5]. The process chart contains the required capabilities to perform the tasks and the task
order. 
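
As an illustration of such a process chart, the following Python sketch represents it as an ordered list of tasks with required capabilities and builds one SPARQL query string per task. The prefix `cap:`, its IRI, the property `cap:providesCapability` and the task names are hypothetical placeholders, not identifiers from CAPILANO or from the custom schema.

```python
PREFIX = "PREFIX cap: <http://example.org/capilano#>"  # hypothetical namespace

# Hypothetical process chart: task order plus required capabilities per task.
process_chart = [
    {"task": "transport_cross_member", "requires": ["Transporting"]},
    {"task": "screw_cross_member", "requires": ["Positioning", "Screwing"]},
]


def build_query(required_capabilities):
    """Return a SPARQL SELECT query for resources providing all capabilities."""
    patterns = "\n".join(
        f"  ?resource cap:providesCapability cap:{name} ."
        for name in required_capabilities
    )
    return f"{PREFIX}\nSELECT DISTINCT ?resource WHERE {{\n{patterns}\n}}"


# Dictionaries keep insertion order, so the task order of the chart is preserved.
queries = {step["task"]: build_query(step["requires"]) for step in process_chart}
print(queries["screw_cross_member"])
```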

During the processing step, querying and task allocation are carried out. For querying,
the requested list of capabilities of the process chart is converted into a SPARQL
query by the Python-based Cython Phaser. Figure 4 visualizes one query loop for the
capabilities of ‘Screwing’, ‘Positioning’ and ‘Transporting’. The SPARQL queries are
forwarded to the Java-based HermiT Reasoner, which checks the compatibility of
capabilities and infers implicit capabilities; its output is cached [19]. At first, individuals
that provide the requested capability are identified (e.g. ‘ScrewDriver_1’ and
‘ScrewDriver_2’ for ‘Screwing’). Then the HermiT Reasoner checks for combinable capabilities
of the resources, e.g., if a capability is required that could be provided by a specific
robot in combination with a particular end-effector (here: ‘ScrewDriver_1’ and
‘MobileManipulator’ are combinable through ‘EndEffector Type 01’). This function fulfills
the combinatory capability inheritance requirement. If several resources match the
capabilities, they are chosen in descending order. Thus, in the first query loop, the one with
the closest matching parameter is selected (e.g. if a payload of 20 kg is requested, a gripper
providing a payload of 25 kg would be preferred over one providing 40 kg), represented
in Fig. 4 by a white square. The loop continues until all possible combinations of
resources are listed in descending order. Once the Cython Phaser has processed all queries,
the cached results are converted into CSV/TSV and forwarded to the task allocation step.
Fig. 4 Querying for ‘Screwing’, ‘Positioning’ and ‘Transporting’ in CAPILANO


60 A. Kluge-Wilkes et al. 


(Depending on the interchangeable third-party system carrying out the task allocation,
this conversion could be adapted, providing compatibility with other programs.) During
task allocation, the best possible resource is selected. The ‘best’ is currently defined by
the following criteria: 1) the necessary equipment is already mounted on the robot,
2) highest battery charge, 3) the first item of the query list.
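
The ranking by closest matching parameter and the three selection criteria can be sketched in Python as follows. The `Candidate` fields, the numeric values and the lexicographic way the criteria are combined are assumptions made for illustration; the text does not specify how the criteria are weighted against each other.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical query result row for one resource combination."""
    name: str
    payload_kg: float
    equipped: bool        # necessary equipment already mounted on the robot
    battery_pct: float


def rank_by_closest_match(candidates, requested_payload_kg):
    """Keep candidates that meet the request, smallest surplus first."""
    feasible = [c for c in candidates if c.payload_kg >= requested_payload_kg]
    return sorted(feasible, key=lambda c: c.payload_kg - requested_payload_kg)


def select_best(ranked):
    """Prefer equipped resources, then battery charge, then list position."""
    # max() returns the first maximal element, so the query order breaks ties.
    return max(ranked, key=lambda c: (c.equipped, c.battery_pct))


candidates = [
    Candidate("Gripper_40kg", 40.0, equipped=True, battery_pct=80.0),
    Candidate("Gripper_25kg", 25.0, equipped=True, battery_pct=80.0),
    Candidate("Gripper_10kg", 10.0, equipped=True, battery_pct=95.0),
]
ranked = rank_by_closest_match(candidates, requested_payload_kg=20.0)
print([c.name for c in ranked])
print(select_best(ranked).name)
```

With a requested payload of 20 kg, the 25 kg gripper is ranked ahead of the 40 kg gripper, and the 10 kg gripper is excluded because it cannot provide the requested payload.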

During post-processing, the list of allocated resources and tasks is converted into an
OWL-readable format and the results are integrated and updated in CAPILANO.


6 Evaluation and Results 


According to GÓMEZ-PÉREZ [15], the fulfillment of the evaluation criteria consistency,
completeness and expandability was investigated to validate and verify CAPILANO.
Consistency was proven by deriving the inferred hierarchy from the asserted hierarchy
of CAPILANO with the HermiT Reasoner and applying the ROMEO (Requirements-oriented
methodology for evaluating ontologies) methodology [20]. To verify the correct
implementation and programming of the conceptual ontology (computerized model
verification), the ontology taxonomy evaluation and a comparison of the asserted and
inferred hierarchy were applied [14].

An application ontology was implemented to validate the framework within its 
intended scope and determine whether its output behavior provides an acceptable accu- 
racy [14]. Use-case-specific instances are modeled according to the conceptual ontology, 
creating a knowledge base/description model on an operational level, consistent with [8, 
13] and [12]. The process of truck chassis assembly is used as an application scenario: 
The parts ‘cross member’, ‘front member’ and ‘rear member’ have to be transported from 
storage to the chassis and have to be screwed onto ‘chassis’. To fulfill this process, the 
capabilities ‘Screwing’, ‘Transporting’ and ‘Positioning’ are requested with differing 
property parameters of acceleration, velocity, jerk, etc. As resources, several stationary 
robots (ABB_4600, ABB_2600), mobile robots (Kairos) and equipment (gripper, screw- 
driver) are available, which provide the requested capabilities in varying resource combi- 
nations. 

Based on the application ontology, the framework’s performance is analyzed by measuring
the time to process a task allocation. Twenty unique queries with varying degrees
of complexity were created and used as the seed for a randomizer to generate process
charts. For each data point, five process charts with the same number of queries but
unique randomized queries are processed. The run-time of these five charts is averaged to
even out differences in complexity between the process charts. The queries and
process charts can be found here: https://github.com/AKlugeWilkes/IoP-CAPILANO/tree/main/03_Evaluation.
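
The measurement procedure can be sketched in Python as follows. The query pool, the stub function `process_chart` and the fixed random seed are placeholders chosen for illustration; in the actual evaluation, processing a chart involves the SPARQL querying and allocation steps described above.

```python
import random
import time
from statistics import mean

QUERY_POOL = [f"query_{i:02d}" for i in range(20)]  # stands in for the 20 unique queries


def make_chart(n_queries, rng):
    """Draw a randomized process chart of n_queries from the pool."""
    return [rng.choice(QUERY_POOL) for _ in range(n_queries)]


def process_chart(chart):
    """Placeholder for querying and allocation; returns the number of queries handled."""
    return sum(1 for _ in chart)


def average_runtime(n_queries, rng, charts_per_point=5):
    """Average the wall-clock runtime of several charts of equal size."""
    runtimes = []
    for _ in range(charts_per_point):
        chart = make_chart(n_queries, rng)
        start = time.perf_counter()
        process_chart(chart)
        runtimes.append(time.perf_counter() - start)
    return mean(runtimes)


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
results = {n: average_runtime(n, rng) for n in (5, 25, 50, 100)}
for n, seconds in results.items():
    print(f"{n:>4} queries: {seconds:.6f} s on average")
```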

The graph “Average runtime vs. number of queries” a) in Fig. 5 presents the scaling
performance of the ontology. The querying was carried out for 5, 25, 50, 100, 250, 500,
1000, 1500 and then every 1500 queries. Based on the test data, it shows linear scaling
behavior. The graph “Runtime per query vs. number of queries” b) in Fig. 5, along
with an R-squared trendline, presents an upward trend, representing a non-linear behavior.
Linear behavior is obtained when normalized to an error percentage of 0.8385. It is
concluded that the developed ontology has a linear scaling behavior within a margin of
0.8385%.

Fig. 5 Runtime analysis depending on the number of queries

It was shown that the developed framework is scalable, as one can integrate new entities
for application and inherit other ontologies. As visualized in the figures above, querying
and matching resources to tasks based on the required and provided capabilities was
realized through SPARQL querying and Python-based allocation. Capabilities resulting
from combining resources into a new one can be inherited from one instance to another. A
transfer of query results to third-party software to enable task allocation was realized
as an example by developing interfaces allowing for a transfer into a Python program and
can be adapted for other third-party software.


7 Conclusion and Outlook 


This paper contributes the ontology CAPILANO. CAPILANO formally describes 
assembly resources and their combined capabilities as a function of equipment using 
the Web Ontology Language (OWL). Based on CAPILANO, a framework matching 
resource capabilities with task requirements and subsequent availability-aware task allo- 
cation was developed. Compared to the manual allocation of tasks to resources, auto- 
matic allocation requires less time and provides reproducible results. In conclusion, the 
developed framework supports the planning of mobile assembly stations by displaying 
possible combinations of resources and allocations to assembly tasks, reducing the time 
required for task allocation compared to manual allocation. 

In future research, the framework will be extended by investigating the spatial and 
temporal requirements of a feasible formation. Spatial reachability and manipulability 
of the allocated tasks as well as collision avoidance of the allocated resources will be 
researched. The task allocation applied will be extended by incorporating criteria like 
proximity of the resource and the allocated task pose, to optimize production time. 
To explicitly integrate a higher degree of detail of the implicitly existing knowledge
of humans necessary for automated assembly planning and the subsequent assembly
execution into the ontology, the ontology can be enhanced by additional parameters. For
example, parameters such as the wear and tear of resources over time or the consideration
of measurement systems for localization on a map can be added.
Abstract


Takt work represents a significant risk factor for the development of musculoskel- 
etal complaints and diseases, especially in short-cycle processes. The increased risk 
results primarily from a permanent uniform load on the musculoskeletal system. Stud- 
ies on motor variability suggest that an increase in load variation can have positive 
effects on reducing the risk. 

The research project “Integration of activity-specific load changes to reduce phys- 
ical stress during takt work” aims to demonstrate the increase in load variation by 
introducing specific load changes during takt work as a possible means of preventing 
musculoskeletal disorders without causing negative effects on productivity. For this 
purpose, a pilot study was already carried out with ten subjects, which is presented in 
more detail in this paper. 

As a foundation for the description of this study, this paper first provides
background on the applied theoretical concepts as well as the design of the overall
research project. This is followed by the presentation of the experimental procedure
and the results of the pilot study on cyclic assembly. Based on the stress profiles
determined via surface electromyography, the sequence of the analysed reference


S. Jansing (✉) · C. Rieger · T. Jabs · J. Deuse
Institute of Production Systems, TU Dortmund University, Dortmund, Germany
e-mail: steffen.jansing@tu-dortmund.de


F. Wagenblast · R. Seibt · J. Gabriel · J. Spieler · M. Rieger · B. Steinhilber
Institute of Occupational and Social Medicine and Health Services Research,
University Hospital Tübingen, Tübingen, Germany


© The Author(s) 2023
T. Schüppstuhl et al. (eds.), Annals of Scientific Society for Assembly, Handling and
Industrial Robotics 2022, https://doi.org/10.1007/978-3-031-10071-0_6




assembly process is reconfigured in order to integrate load changes. Future investiga- 
tions within the research project are planned to compare both processes in terms of 
risk surrogate parameters for musculoskeletal disorders. 


Keywords 


Takt work · Cyclic assembly · Manual assembly · Musculoskeletal disorders · Load alternation


1 Introduction 


A large proportion of all employees in manufacturing companies in Germany is involved 
in takt-based work. The main reasons for this widespread use are the advantages of 
increased transparency in production processes as well as increased productivity and 
reduced training time for employees [1]. Although a cycle time of about one minute has 
become typical in many manufacturing companies [2], a continuous reduction in the 
amount of work per cycle with a concomitant decrease in cycle time is discernible. Thus, 
42% of all production processes in industrialised countries have a cycle time of less than 
1.5 min and 26% of all processes have a cycle time of less than 30 s [3]. 

On the other hand, takt work is a significant risk factor for the development of various 
musculoskeletal disorders (MSDs) and complaints (MSCs) due to frequent repetitive and 
uniform movements [4, 5]. This also manifests itself in employees’ absenteeism from 
work. In 2020, for example, 26.8% of all days of incapacity to work among employees
in the manufacturing sector were attributable to MSDs, resulting in a loss of gross value 
added of 10.6 billion Euro [6]. Since a shift away from takt work is unlikely due to its 
widespread use, there is a need for new approaches for work design to adapt the form of 
takt work to the employee. 

Both in industrial practice and in science, load alternation is proposed as a possible 
means of reducing physical stress. The reason for this is the assumption that the same 
motor units and associated muscle fibres are generally always activated and stressed 
when performing uniform activities. This can lead to overload or even degeneration of 
individual muscle fibres [7]. It is assumed that load changes protect individual motor 
units from such overload situations [8]. Furthermore, it is assumed that a greater variation
in load contributes to a relief of motor units [9, 10]. However, there is a lack of evidence
for the targeted use of this approach, which is why it is the focus of this
research project. For the investigations presented in this paper, load changes are therefore
defined as targeted relief or different types of loads on muscles between activity segments.
This publication is primarily concerned with the first sections of the project: the
definition of exemplary assembly processes with the help of a pilot study.


Exploratory Pilot Study for the Integration of Task-Specific ... 67 


Before providing details on the pilot study as well as the overarching research project,
the necessary theoretical background is outlined in the following chapter. This includes
the characterisation of repetitive activities as well as the definition of load and stress. In
addition, the state of the art in recording physiological stress is presented.


2 State of the Art 


Repetitive activities are not clearly defined in the literature. However, they are unani- 
mously described as activities that continuously stress the same muscle groups, tendons, 
etc. in a short time sequence and are performed over a period of at least 60 min [11, 12]. 
Loads are defined according to [13] as external conditions and demands in a system that 
affect the physiological or psychological stress of a person, whereby the objectivity of 
the load is a central characteristic [14, 15]. The internal reaction resulting from the load, 
which is individual for each person, is referred to as stress and depends on the person's 
individual characteristics [13]. 

Both subjective and objective methods are available for measuring stress. The subjec- 
tive techniques include not only the questioning of perceived exertion, e.g. via the Borg 
scale [16], but also the description of physical complaints or self-assessment via standardised
questionnaires (e.g. self-state scale [17] and NASA task load index [18]). Compared
to objective methods, subjective ones are at a particular disadvantage because of
their low resolution and the fact that they can be influenced at will [19]. In addition to
the evaluation of e.g. produced quantities and number of errors to determine the work
performance [20], physiological methods for determining stress represent the core of
objective methods.

As a prominent physiological method, surface electromyography (SEMG) allows the
electrical muscular activity to be measured noninvasively. Through the person- and
muscle-specific normalisation of the amplitude parameter RMS (root mean square) calculated
from the SEMG signal during physical work in relation to the RMS during maximum
voluntary activation (%MVE), work-related muscular stress can be characterised and
attributed to a wide range of work activities [21]. A frequently used characterisation method
in ergonomic research is the amplitude probability distribution function with the 10th
percentile, the median and the 90th percentile as MSC risk indicators [22]. In cyclic
assembly work, SEMG assessment therefore indicates muscle-specific peaks or prolonged
episodes of muscle stress that would be of interest for a redesign [23].
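
A minimal Python sketch of the three percentile indicators, assuming the signal is already available as a list of normalised RMS values in %MVE. The sample values are invented, and the linear-interpolation percentile used here is one common convention rather than the method of the cited studies.

```python
def percentile(values, p):
    """Percentile with linear interpolation between closest ranks (0 <= p <= 100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def apdf_indicators(rms_percent_mve):
    """Return the 10th percentile, the median and the 90th percentile."""
    return {
        "P10": percentile(rms_percent_mve, 10),
        "P50": percentile(rms_percent_mve, 50),
        "P90": percentile(rms_percent_mve, 90),
    }


# Invented normalised RMS samples in %MVE.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
print(apdf_indicators(samples))
```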

For ergonomic design, a reduction in muscular stress during physically demanding 
work activities is considered positive [24]. A correlation with an increased risk of MSCs 
has already been demonstrated, particularly for static muscle stress indicated by the 10th
percentile [25]. In addition, other risk surrogate parameters can be calculated from the
SEMG measurement. For example, the number of muscle activities [26] and the relative
total duration of activities below 0.5% of the maximum activation [27] could be associated
with an increased risk of MSCs in the shoulder-neck region [28]. Similarly, a low
degree of cycle-dependent standard deviation of muscular activation (motor variability)
[29] is thought to be associated with the development of MSCs and MSDs [30].
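
Two of these surrogate parameters can be sketched in Python. The 0.5 %MVE threshold follows the text; the sample data, the function names and the use of the per-cycle mean as the basis for the cycle-to-cycle standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev

REST_THRESHOLD = 0.5  # %MVE, threshold named in the text


def relative_rest_time(rms_percent_mve):
    """Share of samples (in percent) with activity below the rest threshold."""
    below = sum(1 for value in rms_percent_mve if value < REST_THRESHOLD)
    return 100.0 * below / len(rms_percent_mve)


def cycle_to_cycle_sd(cycles):
    """Standard deviation of the mean activation across work cycles."""
    return stdev([mean(cycle) for cycle in cycles])


# Invented samples: one list of %MVE values per assembly cycle.
cycles = [
    [0.2, 4.0, 6.0, 0.4],
    [0.3, 5.0, 7.0, 3.7],
    [0.1, 3.0, 5.0, 0.3],
]
all_samples = [value for cycle in cycles for value in cycle]
print(relative_rest_time(all_samples))
print(cycle_to_cycle_sd(cycles))
```

A low value of the second figure would correspond to low motor variability in the sense used above.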


3 Basics of the Project 


In the research project “Integration of activity-specific load changes to reduce physical 
stress during takt work”, the aim is to provide conceptual proof of the positive effect of 
load changes and an increase in load variation during takt work on the aforementioned 
risk surrogate parameters for MSCs and MSDs. After introducing the overall study 
design in this chapter, the focus of this paper is on the preliminary pilot study that was 
already carried out, including the presentation of the findings in Chap. 4. 


3.1 Project Design 


The research project is divided into four consecutive parts. In the first work package
(WP), an assembly process is defined, which serves as a reference for the entire study 
and which fulfils the essential characteristics of a cyclic, manual work system. In the 
subsequent second WP, the reference process is carried out by ten experimental subjects 
(half male and half female) in a pilot study. The test persons are equipped with meas- 
urement technology (SEMG) during the execution of the assembly. The muscular stress 
profiles generated based on the measurement are used to subsequently reconfigure the 
chronological sequence of partial activities of the assembly process to integrate load changes.
In addition, the duration of the study for the main experiment is determined by analysing 
the timing of the increase in stress in the sense of physical complaints, physical exertion 
or signs of muscular fatigue. In the subsequent main experiment, WP three, the refer- 
ence process is compared to the reconfigured assembly process. For this purpose, data 
is collected from 40 test persons (half male and half female) who are randomised, bal- 
anced and blinded to both assembly processes on two different days under laboratory 
conditions. During execution, SEMG data is collected as well as information on execution
times and errors. In WP four, a methodological approach is developed based on
the comparison and evaluation of the results from the main experiment. This approach 
should represent a procedure for industrial practice to classify the muscular stresses and 
loads of partial activities without measurement support. This makes it possible to deter- 
mine an optimised sequence of stresses for the assembly to be carried out. In the case of 
a successful proof of concept (PoC), further studies are required to validate and expand 
the developed method. 
Due to the planned laboratory studies with experiments on humans, an ethics application
with the planned study protocol was prepared at the beginning of the research
project and submitted to the responsible ethics committee at the Medical Faculty of
Tübingen. In this protocol, all methods and precautionary measures for the pilot study as
well as the main experiment are described in detail. The application received a positive
vote from the ethics committee.


3.2 Definition of the Reference Process and Work System Design


At the beginning of the study, a manual assembly process was defined as a reference 
process, which fulfils essential characteristics of a cyclic work system. All requirements 
for this process were formulated within the scope of a specification sheet. In addition to 
the conditions defined in the research proposal and results of a literature research, inter- 
views with representatives from industrial practice were fundamental to this. Exemplary 
requirements include a target process duration of 60 s, the integration of ambidextrous 
work and the integration of static as well as dynamic loads. Based on these specifica- 
tions, several drafts for assembly processes were developed, physically implemented in 
the laboratory and evaluated in terms of time and ergonomics. Through close cooperation 
between the research partners, the assembly process could be checked with regard to the 
defined requirements and iteratively adjusted. 

The final reference process consists of 13 partial activities, which are mostly inde- 
pendent in terms of the assembly sequence. Due to the design as a manual assembly pro- 
cess, only two operations require the use of tools in the form of an electric screwdriver. 
Based on the MTM (Methods-Time Measurement) process description, the movements 
of the partial activities range from the Get and Place of larger individual parts to the 
Handle Tool of electric screwdrivers. The individual partial activities can be described 
in terms of the elements of the basic movement cycle Reach, Grasp, Move, Position and 
Release and are composed of various basic operations [31]. Due to the different motion 
lengths, despite a maximum execution time of approx. 3 s for a basic movement cycle, 
a partly dynamic or static strain of the different muscle groups takes place over several 
sequence segments. 

The analysis of the execution time of the process using MTM-UAS (Universal Analy- 
sis System) as a system of predetermined times leads to a standard time of approx. 63 s 
for the assembly process. The ergonomic assessment of the work system by means of 
the ergonomic assessment worksheet (EAWS) results in a medium overall risk and thus 
requires measures for risk control. The defined work system shown in Fig. 1 is
height-adjustable and corresponds to the state of the art. The individual components with
weights ranging from a few grams to 2.1 kg are located on a rack in front of the worker.
They are assembled in housings fixed on trays, which are moved from left to right on a 
roller conveyor. 


Fig. 1 Work system for assembly of components 


4 Pilot Study 


In preparation for the main investigation, a pilot study was conducted to derive the 
experimental setup. The aim of the series of measurements was to create a data basis for 
the reconfiguration of the assembly process and to define the observation period for the 
following main investigation. In addition, the influence of the previous manual experi- 
ence was to be examined as well as the training concept used. 


4.1 Methods and Procedure 


The study population for the pilot study consisted of ten right-handed subjects with an 
average age of 30 years, half women and half men, who had different amounts of expe- 
rience in assembly work and provided written informed consent prior to participation. 
In addition, no limitations in the musculoskeletal system of the upper body and general 
physical health without previous illnesses were defined as inclusion criteria. 

First, the subjects were prepared for the measurement and instructed in the measure- 
ment procedure. Descriptions based on [32] were used for the localisation of the fol- 
lowing muscles of the forearm and shoulder-neck area at the right body side: extensor 
digitorum muscle, flexor digitorum superficialis muscle, infraspinatus muscle, deltoid 
anterior muscle and upper trapezius muscle (also at the left body side). These muscles 
were selected because of their relationships with work-related complaints in repetitive 
work [33]. Then, electrodes were attached according to the SENIAM recommendations
[34]. Subsequently, the normalisation procedure was carried out. For this purpose, participants
had to perform three maximum voluntary contractions (MVC) of each targeted upper
body muscle (1 min pause in between MVCs) while the muscle activity was recorded.
In order to achieve a uniform working method with a standardised procedure, the test 
persons were instructed and trained in the assembly process afterwards. The reference 
performance to be achieved for the subsequent measurement to start was defined as the 
execution of two consecutive error-free processes with a permissible deviation of ± 10%
from the standard performance. The four-step method according to REFA, which focuses 
on manual, short-cycle and simply structured tasks with a standardised sequence, was 
used as the training concept [35]. 
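
The start criterion can be expressed as a short Python check. The 63 s standard time is taken from the MTM-UAS analysis of the reference process; interpreting "standard performance" as the standard time per cycle, as well as the record format, are assumptions made for illustration.

```python
STANDARD_TIME_S = 63.0   # approx. standard time from the MTM-UAS analysis
TOLERANCE = 0.10         # permissible deviation of ± 10%


def within_tolerance(cycle_time_s):
    """True if the cycle time deviates at most 10% from the standard time."""
    return abs(cycle_time_s - STANDARD_TIME_S) <= TOLERANCE * STANDARD_TIME_S


def training_complete(cycles):
    """Check for two consecutive error-free cycles within the tolerance.

    `cycles` is a list of (cycle_time_s, error_count) tuples in execution order.
    """
    ok = [errors == 0 and within_tolerance(t) for t, errors in cycles]
    return any(a and b for a, b in zip(ok, ok[1:]))


# Invented training log: (cycle time in seconds, number of errors).
training_log = [(75.0, 1), (66.0, 0), (71.0, 0), (64.0, 0), (61.5, 0)]
print(training_complete(training_log))
```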

During the subsequent assembly, various subjective and objective measurements were 
taken. As a subjective procedure, the test persons were asked before and after the assem- 
bly as well as every 20 min during the assembly about their perceived exertion using 
the Borg scale [36] and their personal discomfort at parts of the upper body [37] using a 
numerical rating scale from 0 to 10. As objective procedures, a SEMG measurement was 
carried out as core element, in addition to determining the execution times and errors. 
With the help of the recorded execution time, a uniform execution speed was determined 
based on the degree of time defined by REFA [35], whereby a deviation of ± 10% from
the standard time was defined as permissible. 
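
The tolerance check can be illustrated with a minimal sketch. It assumes that the degree of time is the ratio of the standard time to the measured execution time, expressed in percent; this reading, the function names and the example times are assumptions for illustration and are not taken from the study.

```python
def time_degree(standard_time_s: float, measured_time_s: float) -> float:
    """Degree of time in percent (assumed: standard time / measured time * 100)."""
    if standard_time_s <= 0 or measured_time_s <= 0:
        raise ValueError("times must be positive")
    return 100.0 * standard_time_s / measured_time_s


def within_tolerance(standard_time_s: float, measured_time_s: float,
                     tolerance: float = 0.10) -> bool:
    """True if the measured time deviates by at most `tolerance` from the standard time."""
    return abs(measured_time_s - standard_time_s) <= tolerance * standard_time_s


# Hypothetical cycle: 60 s standard time, 63 s measured.
print(time_degree(60.0, 63.0), within_tolerance(60.0, 63.0))
```

With these hypothetical numbers the measured time deviates by 5% from the standard time and would therefore be accepted.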

The SEMG signals were sampled at 4096 Hz by a SEMG device (PS12-I, Thumedi 
GmbH & Co. KG, Germany) using a combined data analyser and logger, which calcu- 
lated the root mean square (RMS) from the power spectrum in real-time. For the signal 
normalisation the RMS during the assembly was divided by the maximum RMS dur- 
ing the MVCs and is expressed in percentage [% MVE]. To calculate the average muscle 
activity during each partial activity for each subject, the median of the normalised RMS 
of the single partial activities was obtained from five uninterrupted 1 min assembly cycles
after the first 30 min. 
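
The normalisation step can be sketched as follows. This is a minimal illustration, assuming that the RMS samples of one partial activity are pooled over the selected cycles before the median is taken; the text leaves this detail open, and all function names and values are hypothetical.

```python
from statistics import median


def normalise_rms(rms_values, mvc_rms_values):
    """Express RMS samples in percent of the maximum RMS recorded during the MVCs."""
    reference = max(mvc_rms_values)
    if reference <= 0:
        raise ValueError("MVC reference must be positive")
    return [100.0 * value / reference for value in rms_values]


def median_activity(cycles, mvc_rms_values):
    """Median normalised RMS of one partial activity, pooled over several cycles.

    `cycles` is a list of lists: the RMS samples of the partial activity per cycle.
    """
    pooled = [v for cycle in cycles for v in normalise_rms(cycle, mvc_rms_values)]
    return median(pooled)


# Hypothetical RMS values (arbitrary units) for one muscle and one partial activity.
mvc = [40.0, 50.0, 45.0]
cycles = [[5.0, 10.0], [15.0, 20.0], [10.0]]
print(median_activity(cycles, mvc))  # normalised median in % MVE
```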


4.2 Results and Discussion 


The evaluation of the subjective measurement data shows a continuous increase in the 
subjects’ perceived exertion and discomfort over the duration of the exercise. For exam- 
ple, half of all test persons report physical complaints in the area of the neck and the 
right shoulder after 120 min. At that point, three out of ten test persons report complaints 
in the area of the left shoulder, and another four out of ten test persons report complaints 
in the area of the right wrist. Over the same period, an increase in perceived exertion 
from a median of 6.0 to 9.5 with an interquartile range of 3.5 was observed. The evalua- 
tion of the objectively collected measurement data and process-related parameters shows 
that with regard to the degree of time, there is no need to differentiate the test persons 
according to gender and previous manual experience in assembly. A mean time degree of 
98.15% with a standard deviation (SD) of 0.31% was recorded for all participants. Like- 
wise, no remarkable difference can be determined regarding the number of errors with a 
mean value of 0.1 execution errors per cycle (SD =0.06). 

The evaluation from an occupational health point of view is based in particular on the
assessment of the stress of the partial activities shown in the SEMG measurements. The 
result is exemplified for the right trapezius muscle in the following Fig. 2. This shows 
the time course for a representative cycle of muscular load. The median as well as the 
10th and 90th percentile of the load are shown for the individual partial activities for all
subjects. Sections with a high load are highlighted in colour. The illustration shows that 
there is a concentration of highly stressful partial activities in the first section of the 
assembly process and a decrease in the stress level towards the end of the cycle. 

In summary, the pilot study shows the validity of the training concept, and from a pro- 
cess-technical point of view, there is no need to differentiate between subjects based on 
previous assembly experience and gender. The structured training results in homogenous 
cycle times for all study participants, which is necessary to eliminate possible effects of 
different work paces [38, 39]. Additionally, error rates are held at a low level to resemble 
skilled industrial assembly with uniform movements. 

From an occupational health perspective however, the differentiation of male and 
female test persons remains relevant due to different physiological prerequisites [40]. In 
the analysis of muscular stress and perceived complaints, an increase in the frequency of 
complaints and in muscular stress is observed after only two hours. As a result, a meas- 
urement duration of 2.5 h is deemed sufficient for the future main study. Concerning the 
recorded Borg values, the feeling of exertion is generally considered to be low to mod- 
erate in that time span as expected from the EAWS assessment of the work system ex 
ante. Furthermore, the stress based on the SEMG data is in line with previous studies on 
manual assembly [41]. As an extension to existing approaches, different stress levels are 
associated with the single partial activities of the process, which serves as foundation for 
the subsequently described reconfiguration. 

Fig. 2 Median and percentiles of the stress on the trapezius muscle (right) for all subjects as well
as exemplary progression


5 Reconfiguration


Based on the stress profile of the reference process determined in the pilot study, the 
reconfiguration is carried out by defining a new sequence. The aim is to achieve the high- 
est possible variation of the partial activities in order to prevent a continuous or uniform 
stress on individual muscle groups. The redefinition of the assembly sequence has to take 
place under the framework conditions of an unchanged temporal and ergonomic evalu- 
ation of the process. The work system is not modified in order to prevent overlapping 
effects. In addition, the realism and practical relevance should be maintained in the rede- 
sign and thus the joining sequence should not be abstracted too much. 

The result of the reconfiguration is based on the stresses per partial activity deter- 
mined in the pilot study and is shown again for the trapezius muscle (right) in the fol- 
lowing Fig. 3. It is visible that in the reconfiguration the partial activities with high stress 
alternate with partial activities with low stress. Additionally, it can be stated that due to 
the limitations of the technically possible assembly sequences and the goal of a constant 
takt time, it is not possible to determine an optimal solution regarding stress. 
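
The idea of alternating high- and low-stress partial activities under precedence constraints can be illustrated with a simple greedy heuristic. This is only a sketch and not the procedure used in the study: the stress values, the activity names and the precedence relation are hypothetical, and the heuristic makes no claim of optimality.

```python
def alternate_sequence(stress, predecessors):
    """Greedy ordering that maximises the stress change between consecutive activities.

    stress:       dict mapping activity name -> stress level (e.g. median % MVE)
    predecessors: dict mapping activity name -> set of activities that must come first
    """
    remaining = set(stress)
    done = []
    previous = None
    while remaining:
        # Only activities whose predecessors are all finished may be scheduled.
        available = [a for a in remaining
                     if predecessors.get(a, set()) <= set(done)]
        if not available:
            raise ValueError("precedence constraints cannot be satisfied")
        if previous is None:
            # Start with the least stressful available activity.
            choice = min(available, key=lambda a: (stress[a], a))
        else:
            # Pick the activity whose stress differs most from the previous one.
            choice = max(available,
                         key=lambda a: (abs(stress[a] - stress[previous]), a))
        done.append(choice)
        remaining.remove(choice)
        previous = choice
    return done


# Hypothetical example: B can only be joined after A.
stress = {"A": 30, "B": 28, "C": 5, "D": 8}
print(alternate_sequence(stress, {"B": {"A"}}))
```

Because the order of the activities changes but not their content, the total cycle time is unaffected, which matches the requirement of an unchanged temporal evaluation.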


6 Outlook 


The presented reconfiguration of the assembly process is the basis for data collection in 
the future main study of the research project. In this study, 40 test persons (half male, half
female) will perform both process sequences on two different days in a blinded, balanced
and randomised experimental design. This is followed by the processing, evaluation and 
interpretation of the results. The aim of the main study is to prove the positive influence 
of specific load changes on risk surrogate parameters for MSDs and MSCs. In order to 
provide a methodology for the integration of specific load changes in industrial practice, 
a procedure will finally be developed. In case of a successful PoC, further studies will be 
required to validate and expand this methodological approach. 


Fig. 3 Stress on the trapezius muscle (right) for the reconfiguration based on the pilot study


Abstract


Augmented Reality-assisted robot programming systems (ARRPS) aim to make the 
programming of industrial robots more efficient by providing an AR-based human 
machine interface that allows operators to program robots intuitively and quickly. 
This work aims to contribute to the field by presenting an input and tracking system 
based on the VIVE Lighthouse technology that can act as a basis for ARRPS sys- 
tems, improving maturity, costs and accessibility. To evaluate the system, ARRPS 
core functionality has been implemented so as to demonstrate its basic feasibility. 
An extensive evaluation of the system accuracy has been conducted, as this is one of 
the key criteria for potential adoption of the technology. The feasibility could be suc- 
cessfully demonstrated and it could be shown that the end-to-end mean absolute error 
of the robot path point placement amounts to 11 mm in a workspace of 0.6 × 0.6 ×
0.25 m³ volume. Finally, the robustness and setup time of the system still need to be
improved. 


1 Introduction


Automation is one of the key technologies for enhancing the efficiency of industrial pro-
duction processes. Due to the trend towards an increasing number of variants and small
series, industrial robots need to be programmed more and more frequently. Especially 
Small and Medium Enterprises (SMEs) have difficulties employing automation economi- 
cally because of their small batch sizes and the high costs [1]. This puts cutting down 
the costs of robot programming into the spotlight. Today, the dominating programming 
method in mass production is hybrid programming [2], a two-stage procedure consisting 
of an extensive offline programming (OLP) and simulation step, followed by a relatively 
short online commissioning step using teach-in. New approaches that aim to make robot 
programming more intuitive and less costly include Programming by Demonstration 
(PbD) as well as Augmented Reality-assisted programming systems¹ [4]. They are not
as costly and complex as enterprise-grade OLP software suites, yet more efficient and 
intuitive than online programming, using the teach pendant. These approaches address 
production processes which have low to medium complexity and thus do not require 
extensive simulation-based optimization. 

ARRPS have been researched extensively [3, 5-8]. The basic idea is to provide an 
AR-based human machine interface (HMI) for robot programming that overlays useful 
virtual information like robot paths with the real environment and essentially aims to 
replace online programming via the teach pendant. This allows for faster and more intui- 
tive programming. Typically, path points can be programmed via 3D user input and the 
resulting robot motion can be previewed based on a basic simulation. It has been shown 
that efficiency could be increased significantly, when compared to conventional teach- 
in [5]. However, there are still considerable issues with existing concepts including the 
unavailability of mature input methods and high hardware costs. The goal of this paper is 
to present a novel input and tracking system for AR-based robot programming, as a step 
towards mitigating some of the existing deficits. The contributions include: 


• a flexible and cost-efficient input and tracking system based on the VIVE Lighthouse
technology,

• a modular software platform, released as an open-source software package,²

• a demonstration of the basic feasibility of the system to act as an ARRPS,

• an examination of the accuracy of the augmentations and the end-to-end accuracy of
the path point placement that could be achieved with the presented system.


¹ We adopt the term ARRPS coined by the group of Ong et al. [3] as a general term for similar systems.


In Chap. 2 the theoretical background and related research are presented and analyzed. 
Subsequently, in Chap. 3 the concept for the proposed system is detailed, and in Chap. 4 
the implementation is briefly outlined. In Chap. 5 the evaluation is presented and the 
results are thoroughly discussed. The paper concludes with a summary and an outlook on 
future work in Chap. 6. 


2 Related Work 


ARRPS systems [3, 5-8] try to improve robot programming by employing AR as an 
intuitive HMI utilized at the production site. The most comprehensive milestone publica- 
tions are presented subsequently. 

Lambrecht et al. [6] present an ARRPS that consists of a tablet and a Kinect to pro- 
vide gesture input. The components are calibrated to each other using ArUco markers. 
The accuracy of the gesture recognition component is evaluated to be around 6 mm; the
end-to-end accuracy of the system is not examined.

Vogl [5] presents a system for robot programming using an infrared-tracked stylus 
for path point placement. The robot trajectory is projected onto workpieces with high 
accuracy using a laser projector. The overall system accuracy is not examined; however,
based on the components’ accuracies, the overall error is likely below 2 mm.

In their recent work, Ong et al. [8] show an ARRPS for welding applications. The
system is built with the commercial OptiTrack system as the main tracking system. The 
input device is a computer mouse with OptiTrack markers attached to it. The system is 
evaluated with user tests, showing that programming time could be saved and the accu- 
racy necessary for the process could be achieved. 

An analysis of the literature shows that little focus has gone into examining end-to-end system accu-
racy of the proposed concepts, although this is a crucial criterion determining which
range of production processes could be covered. As for the components, the presented 
input devices mostly have low maturity or few functions. The employed tracking systems 
tend to be either very costly or rather inaccurate. Lastly, none of the systems are released 
as open-source software. 

The aforementioned deficits are addressed by designing a novel input and tracking 
system that has low costs, yet reasonable accuracy. The input hardware needs to have
a good level of maturity and a rich function set. The system shall be implemented in
a modular software platform and released as open-source software, so others can build
upon it. The system will be evaluated through a demonstration of basic feasibility. Fur-
thermore, an extensive examination of the system accuracy is going to be conducted.


² https://github.com/MarvinGravert/ViveBasedArrpsPlatform


3 Concept 


The basis for the proposed system is the Microsoft HoloLens, a modern Optical See- 
Through Head Mounted Display. It is combined with the Lighthouse (LH) system as an 
additional outside-in tracking system which was originally built for VR applications. 
This enables the use of the VIVE controllers and the so-called VIVE Trackers for object 
tracking. The LH system has not previously been used as an input and tracking system for
ARRPS systems. 

This setup enables natural and ergonomic input using established controller hard-
ware. The functional space of the controller is rich: multiple buttons and a trackpad can
be used. Also, diverse gesture input can be implemented using a 3D-tracked controller.
The used VIVE setup can be considered low-cost when compared to other commercially 
available tracking systems. The LH technology is robust against environmental condi-
tions, as it uses infrared light. Its mean static accuracy has been determined to be < 3 mm
[9], which is a very good performance for a low-cost system. The recommended work-
space with two LHs is 3.5 × 3.5 m², but it can generally be chosen arbitrarily. The system
can be extended by adding more LHs, yielding flexibility in terms of setup. On the flip
side, the mobility of the system is limited because the LHs need to be set up. However,
by placing them on movable tripods and relying on their ability to calibrate themselves
automatically, decent mobility of the system can still be achieved.

Figure 1 shows the system overview, that is, the main components and their interactions.
The communication of all components is carried out via the central server. The HoloLens
acts as the main user interface. A distributed software architecture is employed where the
HoloLens is the front-end, and the server is the back-end. The LH system complements
the internal capabilities of the HoloLens. It acts as the central tracking system, tracking the
controller as well as the HoloLens with a Tracker mounted on it. The LH system sends the
tracking information, as well as the controller inputs, to the server. In order to automatically
register the virtual AR scene with objects tracked by the LH tracking system, the HoloLens
is tracked externally. The robot interface exchanges robot motion commands as well as the
current state of the robot with the server. A Tracker is mounted initially to the robot flange
to reference the robot with the LH; it can be removed during usage of the system.

Fig. 1 System overview


4 Implementation 


Figure 2 shows the system architecture. The server is implemented using a modular micros- 
ervice architecture. It provides essential services to the HoloLens application, namely, the 
tracking hub service, the robot path service and the registration service. The tracking hub 
service collects all the information from the LH tracking system. Additionally, the registra- 
tion service helps with the Lighthouse-HoloLens-registration procedure. The robot path ser- 
vice stores robot paths and controls the robot via the robot interface. Finally, the front-end 
app on the HoloLens manages the application state and provides the user interface. 

The server is run in a Docker container for system-independent and robust operation. 
Internally, the communication is carried out via gRPC. Since the HoloLens does not sup- 
port gRPC, TCP is used for the communication between the server and the HoloLens. 
The server is implemented using the Python programming language, whereas the Holo- 
Lens front-end is designed using the Unity game engine. The latter is rather basic, as 
the focus of this work does not lie on the user interface but rather on the tracking sys-
tem. The robot-specific robot interface is not implemented; instead, the data is transferred
manually. It could be implemented in the future by using the API of the robot manufac-
turer or using a common robot interface like ROS (Robot Operating System). The robot
used is a KUKA KR 6 R900 sixx, and the interface to the LH system is implemented 
with SteamVR. 

Fig. 3 HoloLens-Lighthouse-registration


In order for the tracking scheme to work in practice, the Tracker needs to be cali-
brated with respect to the HoloLens it is mounted on. We call this procedure HoloLens-
Lighthouse-registration. It needs to be carried out once by each user.

Figure 3 shows a scheme of the transformations; the transformation $^{\mathrm{HoloLens}}T_{\mathrm{Tracker}}$
is sought. In order to determine it, the user is asked to align a real object (with a
Tracker mounted onto it) with a virtual copy that they can control through the AR inter-
face. Based on this, correspondences, which are the same points expressed in two differ-
ent coordinate frames, can be collected. The collected data set can be used to solve for
the unknown transformation $^{\mathrm{HoloLens}}T_{\mathrm{Tracker}}$ using Point Set Registration methods [10].

In order to ultimately be able to move the robot flange along user-defined path points,
the robot location needs to be known with respect to the LH system. The transforma-
tion scheme is shown in Fig. 4a); the goal is to determine the unknown transformation
between the robot base and the LH system, $^{\mathrm{Robot}}T_{\mathrm{LH}}$. A procedure called Robot-
Lighthouse-referencing is proposed. A Tracker is initially attached to the robot flange; it
is then moved to discrete locations according to a predefined robot program (as depicted
in Fig. 4b)). At the same time, the LH tracking system records the locations of the 
Tracker. The resulting correspondences can be used to determine the unknown transfor- 
mation using Point Set Registration methods. After the referencing is done, the Tracker 
can be removed. To make this procedure more efficient, a quick changer system for the 
robot flange can be used. All of the steps in the Robot-Lighthouse-referencing could 
potentially be automated. 
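
The text refers to Point Set Registration methods [10] without naming a specific algorithm. For paired correspondences, a common closed-form choice is the SVD-based least-squares fit (often called the Kabsch method). The following NumPy sketch shows this technique under that assumption; the function name and the example points are hypothetical and are not taken from the released platform.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ≈ R @ src + t.

    src, dst: arrays of shape (N, 3) with corresponding points (N >= 3, not collinear).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant of -1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t


# Hypothetical check: Tracker positions in the LH frame (src) and the
# corresponding flange positions in the robot frame (dst).
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
src = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
dst = src @ R_true.T + t_true
R_est, t_est = estimate_rigid_transform(src, dst)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

With noisy measurements the same function returns the best fit in the least-squares sense; the residuals on held-out points then indicate the referencing error.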


Fig. 4 Robot-Lighthouse-referencing: (a) Transformation scheme, (b) Robot movement scheme


5 Evaluation


To evaluate the system, a basic demonstration is conducted in Sect. 5.1. An in-depth
examination of the system accuracy is presented in Sect. 5.2.


5.1 Demonstration 


To evaluate the ARRPS platform, a basic functional demonstration has been conducted. 
The essential feature set to prove the feasibility consists of basic tools for robot path creation
and editing using the controller.

This feature set has been implemented and shown to work reliably. The creation of a 
robot path is depicted in Fig. 7. To program a path point, the user places the lower tip of
the controller at the desired location, where the point is created when the trigger but-
ton is pressed. By pressing the trackpad, the motion type can be altered (Point-to-Point or Linear).
Multiple points are visually connected with lines, whose color indicates the motion type. 
Via the Menu button, the last point can be deleted. The Grip button saves the path to the 
storage. 
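
The button mapping described above can be summarised as a small state machine. The sketch below is illustrative only: it assumes that the motion type selected via the trackpad applies to the points created afterwards, and the class and method names are hypothetical rather than taken from the released platform.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class PathPoint:
    position: Point
    motion: str  # "PTP" (Point-to-Point) or "LIN" (Linear)


@dataclass
class PathEditor:
    motion: str = "PTP"
    points: List[PathPoint] = field(default_factory=list)
    saved_paths: List[List[PathPoint]] = field(default_factory=list)

    def on_trigger(self, tip_position: Point) -> None:
        """Trigger button: create a path point at the controller tip."""
        self.points.append(PathPoint(tip_position, self.motion))

    def on_trackpad(self) -> None:
        """Trackpad: toggle the motion type used for new points."""
        self.motion = "LIN" if self.motion == "PTP" else "PTP"

    def on_menu(self) -> None:
        """Menu button: delete the last point, if any."""
        if self.points:
            self.points.pop()

    def on_grip(self) -> None:
        """Grip button: store a copy of the current path."""
        self.saved_paths.append(list(self.points))


editor = PathEditor()
editor.on_trigger((0.0, 0.0, 0.0))
editor.on_trackpad()
editor.on_trigger((0.1, 0.0, 0.0))
editor.on_trigger((0.2, 0.0, 0.0))
editor.on_menu()   # removes the third point again
editor.on_grip()
print([p.motion for p in editor.saved_paths[0]])
```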

The result of the programming process is a list of path points stored on the server 
which could be transferred through the robot interface. However, this transfer has not 
been implemented. Instead, the robot path is entered manually in the robot teach pendant 
in order to validate it. Further and more complex functionality could be implemented; 
however, creating a functionally rich and mature HMI was not the goal of this work. 


5.2 Examination of the System Accuracy 


Subsequently, the accuracy of the system is evaluated. First, an error analysis is con- 
ducted to explain which errors are examined and to understand what components 
contribute to the overall errors. Afterwards, the augmentation error and the Robot-Light- 
house-referencing are examined separately. Finally, the overall end-to-end accuracy of 
the path point placement is examined because it is the system’s most relevant perfor- 
mance characteristic. 


Error Analysis 
First, the errors and transformations of the path point placement are presented. The path 
point placement error consists of the user error, the LH tracking error, the Robot-Lighthouse- 
referencing error and the robot error, as shown in Fig. 5. This leads to a deviation between 
the location intended by the user and the location the robot would actually move to. 

Note that the AR display is not playing any role in this, since the path point location 
is created directly based on the controller location and transformed into robot coordi- 
nates. The location displayed in AR is meant only as a visual assistance and represents 
the approximate path point location. This is a deliberate design choice because it reduces
the path point placement error considerably. Other designs could still be implemented 
with the platform, but should be expected to have a different accuracy. 
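
The chain of transformations from the tracked controller to robot coordinates can be sketched with 4 × 4 homogeneous matrices. This is a minimal illustration: the matrix names, the tip offset relative to the tracked controller origin and all numeric values are assumptions and do not come from the presented system.

```python
import math


def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform T (nested lists) to a 3D point p."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


def path_point_in_robot_frame(T_robot_lh, T_lh_controller, tip_offset):
    """Controller tip expressed in robot base coordinates.

    T_lh_controller: controller pose measured by the LH system
    T_robot_lh:      result of the Robot-Lighthouse-referencing
    tip_offset:      tip position in the controller frame (hypothetical value)
    """
    tip_in_lh = apply_transform(T_lh_controller, tip_offset)
    return apply_transform(T_robot_lh, tip_in_lh)


# Hypothetical poses: controller translated by (1, 2, 3) in the LH frame,
# LH frame rotated by 90 degrees about z and shifted by 0.5 m in x relative to the robot.
T_lh_controller = [[1, 0, 0, 1.0], [0, 1, 0, 2.0], [0, 0, 1, 3.0], [0, 0, 0, 1]]
T_robot_lh = [[0, -1, 0, 0.5], [1, 0, 0, 0.0], [0, 0, 1, 0.0], [0, 0, 0, 1]]
point = path_point_in_robot_frame(T_robot_lh, T_lh_controller, (0.0, 0.0, -0.1))
print(point)
```

The AR display does not appear anywhere in this chain, which is why its error does not propagate into the path point placement.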

Even though it’s not used for path point placement, the error of the augmentation is 
still important because augmentations should be near the correct locations in order to 
be useful for user information and guidance. As shown in Fig. 6, the augmentation error 
comprises the tracking error, the HoloLens-Lighthouse-registration error as well as the 
HoloLens error. The HoloLens error includes all errors that come from the AR display, 
such as the internal tracking and display error. 


Augmentation Error 

The augmentation is evaluated with user-generated ground truth data. To acquire it, the 
user is asked to align a real controller with a virtual one that they can control via the AR
system. This is repeated in five different locations across a flat, rectangular workspace
of 0.55 m edge length. The LHs are located at the corners of a 4 × 4 m rectangle, at the
center of which the workspace is located. The absolute error of the augmentation is
measured as (15.6 ± 10.3) mm translationally and (1.0 ± 1.1)° rotationally. As expected,
the accuracy of the augmentations is not very high, which is the reason why they are not 
used for the actual path point placement. 


Accuracy of the Robot-Lighthouse-referencing

The Robot-Lighthouse-referencing is evaluated by adding a set of test points to the 
Robot-Lighthouse-referencing procedure and evaluating the registration error for this test 
set, using the transformation that was calculated using the original data set. All points 
are located within a cuboid of 0.55 × 0.55 × 0.45 m³ volume. The absolute error of the
test set is determined as (10.8 ± 3.0) mm. These results are worse than the expectations
before the experiments. The observed error is of random nature, as no axis-specific bias 
could be proven with statistical t-tests. The following steps have been taken to find out if 
they can reduce the error: Variation of position and number of LHs, covering of metallic 
surfaces, change of the axis configuration of the robot, as well as change of the rotations 
of the robot flange. However, the error could not be reduced. More research on this prob- 
lem is needed. 
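
The evaluation on a held-out test set can be sketched as follows: the Euclidean error is summarised by its mean and standard deviation, and a one-sample t statistic per axis indicates whether the signed errors deviate systematically from zero. The function name and the data are hypothetical; the sketch only computes the statistic and leaves the comparison with the critical value of the t distribution (n − 1 degrees of freedom) to the reader.

```python
import math
from statistics import mean, stdev


def error_summary(estimated, reference):
    """Mean and sample SD of the Euclidean error plus a t statistic per axis.

    estimated, reference: lists of (x, y, z) tuples in the same frame and unit.
    """
    diffs = [tuple(e - r for e, r in zip(p, q)) for p, q in zip(estimated, reference)]
    norms = [math.sqrt(sum(d * d for d in diff)) for diff in diffs]
    n = len(diffs)
    t_stats = []
    for axis in range(3):
        values = [diff[axis] for diff in diffs]
        sd = stdev(values)
        # The t statistic is undefined if all signed errors on this axis are identical.
        t_stats.append(mean(values) / (sd / math.sqrt(n)) if sd > 0 else float("nan"))
    return mean(norms), stdev(norms), t_stats


# Hypothetical signed errors in mm with no bias in x and y.
reference = [(0.0, 0.0, 0.0)] * 4
estimated = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
mean_err, sd_err, t_stats = error_summary(estimated, reference)
print(mean_err, sd_err, t_stats)
```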


End-to-end Accuracy of the Path Point Placement 

To examine the end-to-end accuracy of the path point placement, the following experi-
ment is conducted. The user places a path point at one corner of a cuboid-shaped, 3D-
printed workpiece that is mounted on a tripod. The placement is repeated 5 times per
point, so as to reduce the influence of random user error at each location. Subsequently, the tip of a
3D-printed tool that holds a needle is moved to the programmed location. This setup is
shown in Fig. 7. Both the position of the workpiece and the tool are measured using a
Leica LTD 800 laser tracker. That is why the tool and the workpiece are constructed to
be able to hold three laser tracker targets each. This is repeated 14 times within a cuboid-
shaped workspace of approximately 0.6 × 0.6 × 0.25 m³ volume.

Table 1 shows an overview of the errors that have been measured. The absolute value
of the error is (10.7 ± 3.5) mm. There is a bias in the z-axis of 9.3 mm. A possible expla-
nation for the bias is the random error of the Robot-Lighthouse-referencing, which was
carried out once prior to the path point placement and thus would appear as a sys-
tematic error in the subsequent experiment. Thus, as mentioned before, the referencing
procedure should be revised. Overall, a mean absolute error of 10.7 mm is not as good as
was expected when designing the system; however, the large bias suggests that by fixing
the suspected issue, the achievable accuracy should be substantially higher.

Fig. 7 (a) Path point placement in front of a workpiece, (b) Path point placement experiment

Table 1 Error of the path point placement

               | Mean [mm] | Standard deviation [mm]
Absolute value | 10.7 | 3.5
x-axis | 1.9 | 6.9
y-axis | −2.3 | 3.6 | 7.6
z-axis | 9.3 | 3.2 | 14.0

Only a relatively small workspace was tested because the robot used in the experi- 
ments limited the workspace of the evaluation procedure. It is expected that the accuracy 
will be lower in bigger workspaces. 


5.3 Discussion 


To conclude, the mean absolute error of 10.7 mm in a relatively small workspace
is decent, especially when considering it is a low-cost system, but not as good as was 
expected when designing the system. On top of that, in real-world application the over- 
all error is expected to increase rather than decrease, due to perturbations and generally 
less ideal conditions, so robustness needs to be evaluated as well. Given the deter-
mined accuracy, only industrial processes such as painting and handling could be pro-
grammed using the presented system, since they usually have a compatible tolerance of
the tool positioning. The presented stylus-based placement method is ideal for the quick 
robot program creation of small programs from scratch. However, development of more
complex functionality, e.g. the commissioning of a robot program that was planned in a
simulation software, is also possible. At the same time, the applications need to be limited
to medium-size workspaces of about 1 m³, which is likely a sensible trade-off between work-
space size and accuracy for this particular setup. To sum up, a considerably higher accuracy
in medium-size workspaces can be expected if the problem in the referencing can be solved.


6 Conclusion 


In this work, a flexible and cost-efficient input and tracking system based on the VIVE 
technology has been developed. Overall, the system is feasible as a flexible and low- 
cost ARRPS system. The end-to-end mean absolute error of the path point placement is 
10.7 mm. The VIVE controller integration allows for many interesting possibilities when 
designing the user experience. The software platform can be a starting point for other 
developers to implement their own ARRPS-related ideas. 

In future work, the robot interface could be implemented, e.g. using ROS. The robust-
ness of the Robot-Lighthouse-referencing should be improved, so as to improve overall
accuracy. Also, the setup time of the Robot-Lighthouse-referencing should be reduced by 
fully automating the procedure. Furthermore, the system should be evaluated in terms of 
usability and ergonomics for the intended use-cases. 


Abstract


This paper analyses the usability of Augmented Reality (AR) in the commissioning 
and programming of industrial robots. Conducting two individual studies with a total 
of 31 participants we analysed the three dimensions of usability: effectiveness, effi- 
ciency, and user satisfaction by comparing our developed AR system with the con- 
ventional Teach-In programming method during the commissioning and modification 
of offline created robot programs. The results indicate that, while less accurate and 
hence less effective, the AR system is more efficient and achieves higher user satisfaction.
Beyond that a posture analysis indicates that during a timeframe of 30 min the addi- 
tional weight of the AR device does not significantly worsen the posture of a worker. 
Complemented by the positive result of the System Usability Score (SUS) that rates 
the analysed AR system with a good usability, the overall results indicate that while 
still limited by its achievable accuracy AR is an intuitive medium to conduct robot 
programming and commissioning. 


1 Introduction


Augmented Reality (AR) is a novel technology with the ability to combine spatially 
mapped digital and real content in an interactive and multimodal interface [1]. As such 
AR can serve the role of a human-machine-interface (HMI) and is capable of enhancing 
the flexible skills of human workers in an industry 4.0 environment [2]. Offering a more 
intuitive approach to human robot collaboration, AR-based robot programming could be 
a potential alternative to conventional online and offline programming. 

In literature, a variety of different systems realising AR-assisted robot programming 
ranging from path point modification [3], trajectory planning [4], collision detection [5], 
and human-machine collaboration [6] have been developed. Especially the benefits of 
natural gesture based programming methods have shown a higher efficiency as well as a 
good user satisfaction when compared to conventional programming methods [7]. How- 
ever, a breakthrough of AR in the scope of industrial robot programming beyond the tier 
of a proof-of-concept solution has not been achieved yet.

Limited by the available stable accuracy, scenarios beyond pick-and-place application 
[8] are difficult to industrialise. With recent advances such as the introduction of additional
equipment like three-dimensionally tracked styluses [9] or external LIDAR sensors [10],
performance, throughput and accuracy can be increased. Nevertheless, AR is not yet
powerful enough to be a standalone alternative robot programming method in high tier 
automation. 

Hence, we chose a different approach and do not view AR-assisted robot programming
as an alternative to, but as an enhancement of, existing conventional robot programming
methods. Especially in high tier automation industries, where a combination of offline
planning and programming with online commissioning and optimisation is characteristic,
AR can smooth the transition between the two phases.

With AR, we can on the one hand assist the worker in the shopfloor environment with
additional simulation capabilities. On the other hand, deviations can be detected and
directly corrected by comparing the digital model to the real workstation, thus creating a
more accurate digital representation. Based on this motivation, we developed a system that
utilises the visual and interactive abilities of AR to harness the features of an offline
robot programming system inside a factory environment to work with programs on a real
robot [11]. While this approach is generally feasible, more detailed work on the assessment
of accuracy, efficiency, and satisfaction, i.e. usability [12], is necessary.

In the following paragraphs, we will give a brief overview of our developed system, 
after which we will define the usability of a process and introduce different methods to 
measure the individual dimensions. In the end, we will present the results of two stud- 
ies and discuss the usability of our system in the scope of AR-assisted robot program- 
ming. 


2 MiReP: Mixed Reality Programming


We try to smooth the transition from the digital planning environment to the real
workstation by utilising AR in the commissioning of an offline created robot program. We
chose a modularised architecture as our basic design pattern to combine the range of
functions that offline robot programming systems offer with the multimodal interactive
capabilities of AR. Adhering to the dependency inversion principle [13], we split our
application into one core and six independent microservices.

Figure 1 (left) shows the general design of our application. Each microservice is
implemented as an interchangeable plugin that adheres to a standardised interface
defined by the core. As the core uses only the standardised functions of the interface to
orchestrate the different plugins, each system component is enclosed in an independent shell.
This not only increases testability and opens up the possibility of decentralised,
cloud-computed systems, but also creates inherent extensibility. If, for example, a new
plugin for a different AR device is developed, it can be introduced to the system without
the necessity to update the entire infrastructure.
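
The plugin principle can be illustrated with a minimal sketch. All names used here (`Plugin`, `Core`, `HoloLensPlugin`, `OtherHeadsetPlugin`, the `"display"` role, and the request format) are hypothetical and chosen for illustration only; they are not taken from the MiReP implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Plugin(ABC):
    """Hypothetical standardised interface that the core defines."""

    @abstractmethod
    def handle(self, request: dict) -> dict:
        """Process a request issued by the core and return a result."""


class Core:
    """Orchestrates plugins exclusively through the Plugin interface."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Plugin] = {}

    def register(self, role: str, plugin: Plugin) -> None:
        # Registering a plugin under an existing role replaces the old one.
        self._plugins[role] = plugin

    def call(self, role: str, request: dict) -> dict:
        return self._plugins[role].handle(request)


class HoloLensPlugin(Plugin):
    def handle(self, request: dict) -> dict:
        return {"device": "HoloLens 2", "rendered": request["path_points"]}


class OtherHeadsetPlugin(Plugin):
    def handle(self, request: dict) -> dict:
        return {"device": "other headset", "rendered": request["path_points"]}


core = Core()
core.register("display", HoloLensPlugin())
first = core.call("display", {"path_points": 4})
core.register("display", OtherHeadsetPlugin())  # swap without touching Core
second = core.call("display", {"path_points": 4})
```

Because `Core` depends only on the abstract interface, exchanging the AR device in this sketch requires no change to the orchestration code.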

The current implementation, schematically displayed in Fig. 1 (right), utilises a
Microsoft HoloLens 2, a Microsoft Controller, and the simulation system Process
Simulate (PS). By accessing the API of PS, we can not only import, modify, simulate,
and export programs, but also directly export the geometry and position of CAD elements
from the digital model. In combination with the model tracking capabilities of
the Vuforia engine [14], we can detect CAD elements in the real world and register our
system accordingly. After detection, a reference geometry is displayed, and the user is
prompted to confirm the correctness of the registration, as shown in Fig. 2 (left).

Hereafter, programs from accessible machines are automatically imported. The user 
then selects a program to work on in the AR path editor (Fig. 2 right). While the visu- 
alisation and interaction happen in the scope of the AR and input device, all calculations 
regarding reachability, movement simulation, or tool changes are done in the simulation 
system. When ready, the modified program is then re-exported to the associated robot. 

Fig. 1 Schematic sketches of the general architecture (left) and the implementation (right)

Fig. 2 AR view of the MiReP system prior to and after program optimisation


3 Target Dimensions of Usability 


The usability of a system offers a general evaluation of its suitability in a specific use 
case and is generally defined by three dimensions [12]: 


• Effectiveness: “How complete and accurate a user achieves the defined goal.”
• Efficiency: “The used resources relative to the achieved accuracy.”
• Satisfaction: “The perception and reaction of a user due to system.”


When applied to the MiReP application, an HMI enabling a worker to analyse, modify, and
evaluate robot programs of a real machine based on a digital simulation model, these
generalised dimensions can be made concrete.

The effectiveness directly relates to the quality of the modified process executed by
the real robot. It consists of both the available functionality (e.g., the modification of
the pose of a path point or the correct evaluation of a collision) and the quality of
the corrective modification (e.g., the accuracy of the modification). The efficiency is
defined by the amount of time and effort a user invests in reaching the desired result. The
user satisfaction is a more complex parameter, as it embodies multiple interdependent and
highly individual factors, such as mental, physical, and temporal load, as well as
frustration, effort, and the perceived performance.


4 Methodology 


Measuring the usability of AR-assisted robot programming is not trivial. While param- 
eters like end-to-end accuracy or duration can be measured absolutely, parameters like 
effort are coupled with the individual user, the scenario, as well as the current envi- 
ronment. However, while an absolute scale is difficult to realise, a relative comparison 
between two processes can be made. Hence, we will compare the usability of AR in the 
commissioning of an offline created robot program with the conventional Teach-In. 

One applicable standardised questionnaire is the NASA Task Load Index (NASA-TLX) [15].
The questionnaire shown in Fig. 3 consists of six questions, each targeting a different
category, that together offer an assessment of the global workload users perceive
while processing their task.

Based on the results of the different categories, a global workload index with a range 
of 0 to 100 can be calculated. The lower the value, the better the result. 
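
As a minimal sketch of such a calculation, the following example assumes the unweighted variant (often called Raw TLX), in which the global index is the mean of the six category ratings on a 0 to 100 scale. Whether the weighted or the unweighted variant was used is not stated here, and the category keys as well as the example ratings are hypothetical.

```python
TLX_CATEGORIES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six category ratings (each 0..100)."""
    for category in TLX_CATEGORIES:
        if category not in ratings:
            raise ValueError(f"missing rating for {category}")
        if not 0 <= ratings[category] <= 100:
            raise ValueError(f"rating for {category} must lie in 0..100")
    return sum(ratings[c] for c in TLX_CATEGORIES) / len(TLX_CATEGORIES)


# Hypothetical answers of a single participant, not data from the study.
example = {
    "mental_demand": 30,
    "physical_demand": 20,
    "temporal_demand": 40,
    "performance": 25,
    "effort": 35,
    "frustration": 30,
}
print(raw_tlx(example))  # 30.0
```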

Especially during prolonged work, the ergonomics of a task are an important ele- 
ment in a healthy work environment. As HMIs like the HoloLens increase the strain on 
the neck of the user due to their weight, a detrimental effect on the posture is expected. 
Hence, another more specific analysis of working posture is necessary in addition to the 
NASA-TLX, as bad posture correlates with physical demand. 

The Ovako Working Posture Assessment System (OWAS) offers an objective method 
to analyse the posture of a human over a prolonged period [16]. A score between 1 and 4 
is calculated depending on the relative position of back, arms, legs, and the handled load. 
The scoreboard is depicted in Fig. 4. 

As an example, Fig. 5 shows a user in two different working postures. According to the
scoreboard, the left user has a bent back (2), both arms are below shoulder level (1), he
squats (3) and handles a load below 10 kg (1), scoring a value of 2, which implies that
corrective actions to improve the working posture are required in the near future.

[Figure: NASA-TLX rating scales, each ranging from Very Low to Very High (Performance: Perfect to Failure). Legible items: Effort: "How hard did you have to work to accomplish your level of performance?"; Physical Demand: "How physically demanding was the task?"; Frustration: "How insecure, discouraged, irritated, stressed, and annoyed were you?"; Temporal Demand: "How hurried or rushed was the pace of the task?"]

Fig. 3 NASA-TLX questionnaire [15]

[Figure: OWAS scoreboard assigning a digit to each action. Back: bent; twisted; bent and twisted. Arms: both arms below shoulder; one arm at or above shoulder; both arms at or above shoulder. Legs: sitting; standing on two straight legs; standing on one straight leg; standing on one bent leg. Load: > 20 kg.]

Fig. 4 Ovako Working Posture Assessment System (OWAS) [16]

Fig. 5 User during programming with AR (left) and Teach-In (right)

Fig. 6 Sketch of original and aspired contour in simulation (left); view in AR setup (right)

As each OWAS analysis represents only one moment during task execution, it has
to be repeated over a prolonged period of time. An overall grade between 100 and 400 is
calculated depending on the percentage of time the user stays in a bad posture.
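
The exact formula for the overall grade is not given here. A minimal sketch, assuming the commonly used weighted index in which the share of observations in each OWAS action category (1 to 4) is multiplied by the category number and scaled by 100, could look as follows. The function name and the example counts are hypothetical.

```python
def owas_risk_index(observations):
    """Return an index between 100 and 400.

    `observations` maps an action category (1..4) to the number of
    posture snapshots that were rated with this category.
    """
    if any(category not in (1, 2, 3, 4) for category in observations):
        raise ValueError("action categories must be 1, 2, 3 or 4")
    total = sum(observations.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    weighted = sum(category * count for category, count in observations.items())
    return 100.0 * weighted / total


# Hypothetical distribution: 91 snapshots in category 1, 9 in category 2.
print(owas_risk_index({1: 91, 2: 9}))  # 109.0
```

Under this assumption, a value of 100 means that every observed posture fell into category 1, whereas 400 means that every observed posture fell into category 4.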

As absolute values, accuracy and working time can be measured in a simple experimental
setup, as depicted in Fig. 6.

The user modifies an erroneous program (red) by adding and repositioning path points 
until a defined contour (green) is acquired. The thereby created program is then exported 
to a robot, which, armed with a pen, then draws the contour on paper. An assessment of 
the accuracy and working time can be made by measuring the offset of each path point 
as well as the time the user took to modify the program. 
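
The offset measurement can be sketched as follows, assuming that the aspired and the drawn path points are available as matched pairs of planar coordinates in millimetres. The function name and the example coordinates are hypothetical.

```python
from math import dist


def point_offsets(target_points, drawn_points):
    """Return the Euclidean offset of each drawn point from its target."""
    if len(target_points) != len(drawn_points):
        raise ValueError("both contours need the same number of path points")
    return [dist(t, d) for t, d in zip(target_points, drawn_points)]


# Hypothetical contour with three path points (x, y) in mm.
target = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
drawn = [(3.0, 4.0), (10.0, 0.0), (10.0, 16.0)]

offsets = point_offsets(target, drawn)
average_error = sum(offsets) / len(offsets)
```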

An additional method to evaluate the usability of a system is the System Usability
Scale (SUS) [17]. Utilising ten standardised questions, an absolute score between 0 and
100 can be calculated. The corresponding questions are displayed in Table 1.

Each question is to be answered with one of the following statements: “Strongly
Agree”, “Agree”, “Neutral”, “Disagree” or “Strongly Disagree”. From that, a global
score can be calculated. Generally, a value exceeding 70 indicates good usability.

Table 1 System Usability Scale questionnaire

No.  Statement
1    I think that I would like to use this system frequently
2    I found the system unnecessarily complex
3    I thought the system was easy to use
4    I think that I would need the support of a technical person to use this system
5    I found the various functions in this system were well integrated
6    I thought there was too much inconsistency in this system
7    I would imagine that most people would learn to use this system very quickly
8    I found the system very cumbersome to use
9    I felt very confident using the system
10   I needed to learn a lot of things before I could get going with the system
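
The calculation of the global score can be sketched with the standard SUS scoring rule: the answers are mapped to the values 1 to 5, positively worded statements (odd positions in Table 1) contribute the value minus 1, negatively worded statements (even positions) contribute 5 minus the value, and the sum is multiplied by 2.5. The function name and the example answers are hypothetical.

```python
LIKERT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def sus_score(answers):
    """Return the SUS score (0..100) for ten answers given in table order."""
    if len(answers) != 10:
        raise ValueError("the SUS requires exactly ten answers")
    total = 0
    for number, answer in enumerate(answers, start=1):
        value = LIKERT[answer]
        # Odd statements are positively worded, even ones negatively.
        total += (value - 1) if number % 2 == 1 else (5 - value)
    return total * 2.5


# Hypothetical participants, not data from the study.
neutral_user = ["Neutral"] * 10
ideal_user = ["Strongly Agree", "Strongly Disagree"] * 5
print(sus_score(neutral_user))  # 50.0
print(sus_score(ideal_user))    # 100.0
```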


5 Mock-Up and Conduction of the Experiment 


We conducted two independent studies with a total of 31 participants. In each study, every
user commissioned an offline created robot program with both the AR-assisted and the
Teach-In method. In the first study, we used the OWAS to assess the working posture of
ten users aged between 19 and 54.

Figure 7 shows the original program in the simulation system as well as the displaced
program as viewed in AR. The users were split into two separate groups. After a brief
10-min introduction to either the MiReP system or Teach-In programming, each user
commissioned the program for 30 min. The same procedure was repeated with the other
programming method after a short break. The order depended on the group affiliation. Each
user was recorded with a camera during execution. The average risk index of the MiReP
system is 109/400, whereas Teach-In programming has a value of 104/400.

Similar to the first study, in the second study two groups of users with a total of 21
participants aged between 17 and 35 commissioned an erroneous offline created robot
program in different orders. A sheet of paper indicated the aspired contour as a guideline
for the optimisation.

Figure 8 shows a user during the two different tasks. In preparation for the task, each
user was given 10 to 20 min of guided training with each of the two systems. During
training, 5 of the handled pens were broken due to programming errors while controlling
the robot with the Teach-In method.

During the experiment, no pens were broken. However, some users needed additional
assistance while using the Teach-In programming due to operating issues.

In the end, every program created with either method was valid and runnable. The
calculated results regarding accuracy and working time are shown in Fig. 9.

Users filled out a NASA-TLX questionnaire immediately after completing a programming
task. The averaged results are displayed in Fig. 10.

Fig. 8 User during programming with Teach-In (left) and AR (right)

Fig. 10 Results of the NASA-TLX

The average global task load of AR is 29 with a standard deviation of 13.4, whereas
the Teach-In method averaged at 34.9 with a standard deviation of 13.4.

At the end of the experiment, each user filled out a SUS questionnaire. The averaged 
result was 75 with a standard deviation of 12. 


6 Discussion 


Both calculated OWAS scores are acceptable. Even though MiReP scored slightly worse
(109/400), we assume that using AR for a 30-min commissioning task does not significantly
worsen the posture of a user compared to the Teach-In method.

As expected, the accuracy assessment shows that the average error of
9.7 mm with a standard deviation of 6 mm when using AR is significantly worse than the
average error using Teach-In, which is 1 mm with a standard deviation of 0.7 mm.

However, as depicted in Fig. 11, the potential influence of a systematic error can be
detected. After calculating the systematic error and adjusting the result, an average error
of 2.9 mm is obtained, confirming the existence of a systematic effect.
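
How the systematic error was determined is not detailed here. One simple way to illustrate the idea, assuming that the systematic part of the error is a constant planar translation per user, is to subtract the mean error vector from every path point error and to evaluate the remaining error. The function names and the example values are hypothetical.

```python
from math import hypot


def mean_error(errors):
    """Mean length of the error vectors (dx, dy) in mm."""
    return sum(hypot(dx, dy) for dx, dy in errors) / len(errors)


def remove_systematic_offset(errors):
    """Split errors into a constant offset and the remaining residuals."""
    n = len(errors)
    bias_x = sum(dx for dx, _ in errors) / n
    bias_y = sum(dy for _, dy in errors) / n
    residuals = [(dx - bias_x, dy - bias_y) for dx, dy in errors]
    return (bias_x, bias_y), residuals


# Hypothetical error vectors of one user, not data from the study.
errors = [(9.0, 1.0), (10.0, -1.0), (11.0, 1.0), (10.0, -1.0)]
bias, residuals = remove_systematic_offset(errors)
raw = mean_error(errors)
adjusted = mean_error(residuals)
```

In this example, the common shift of 10 mm in x dominates the raw error, so the adjusted error is considerably smaller than the raw error.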

As users initially register the device from an individually chosen angle and position,
and as AR glasses have known limitations regarding depth perception [18], we assume
that the registration plays a major part in the systematic error. However, as the
best user achieved a fairly high level of accuracy (2.8 mm), this also shows that a proper
setup can result in a higher program quality. In addition to accuracy, Fig. 9 also
shows a 32% reduction in programming time compared to Teach-In. A two-sample
t-test indicates that this difference is significant.
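
The test statistic of a two-sample t-test can be sketched as follows. The example assumes Welch's variant, which does not require equal variances; which variant was applied is not stated here, and the working times are hypothetical values, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    va = variance(sample_a) / len(sample_a)  # squared standard error of a
    vb = variance(sample_b) / len(sample_b)  # squared standard error of b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical programming times in minutes.
ar_times = [10.0, 12.0, 11.0, 13.0, 14.0]
teach_in_times = [15.0, 17.0, 16.0, 18.0, 19.0]
t_value, dof = welch_t(ar_times, teach_in_times)
```

The absolute value of the statistic is then compared with the critical value of the t-distribution for the computed degrees of freedom and the chosen significance level.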

The results of the NASA-TLX show that, even though not significant, the global task 
load of AR is slightly better than Teach-In programming. However, it is noticeable that 
the physical load perceived when working with AR is lower than when using the Teach- 
In. While this is contradictory to the results of the OWAS, it can be partly explained by 
the shorter working time and the lower effort needed to reach the aspired goal. Moreover, 
the grading of the perceived performance correlates with the measured accuracy. 


Fig. 11 Comparison of best (avg. error 2.8 mm) to worst (avg. error 25.4 mm) result with AR 

Even though the sample size of 21 is small, the results of the SUS indicate a generally
good usability of the presented AR-assisted robot programming.


7 Conclusion and Outlook 


The presented studies show that, when utilising AR-assisted robot programming, the
commissioning of an offline created robot program is more efficient but less accurate
compared to Teach-In robot programming. It was shown that the initial registration
has a major effect on the overall error; hence, the introduction of additional visual
assistance and continuous feedback to the user could improve the performance of AR-assisted
robot programming. The OWAS showed that during a task duration of 30 min the use of
an AR device does not significantly worsen the posture of a user. This is supported by the
NASA-TLX, which shows a slightly better global workload than in Teach-In programming.
Complemented by the results of the SUS, the presented AR-assisted robot programming
in the commissioning of offline created robot programs has a generally good usability.

The results show that the presented AR-assisted robot programming is currently not
accurate enough to fully substitute Teach-In programming in the commissioning of
offline created robot programs. However, due to its intuitiveness, it is plausible to use AR
either when the accuracy suffices for the given use case or as a preliminary step that
reduces the necessary amount of Teach-In optimisation and thus the overall duration
of commissioning.


Abstract


The commissioning of robot cells requires an individual safety analysis followed
by an appropriate dimensioning of safety components to ensure worker safety.
With increasing complexity of robot work cells, developing proper safety concepts
becomes more challenging. Therefore, updated validation concepts are needed that
support safety engineers and reduce delays in commissioning. This paper presents
a solution to decrease the commissioning time for robot cells that is caused by safety
considerations and the implementation of measurements. Data from a digital twin (DT) is
used to generate test programs that automatically measure relevant safety distances.
The virtual robot cell is used to generate robot paths for the real cell. A laser-based
distance sensor measures the distance to relevant objects within the real cell. The
programs run automatically, and the safety engineer only defines safety-relevant
points within the DT. Furthermore, augmented reality (AR) is used to visualize
safety zones specific to the individually generated safety concept.


1 Introduction


Safety analysis for industrial robot cells is a crucial step during commissioning [1].
Every foreseeable potential hazard needs to be evaluated and minimized to ensure
machine operator safety. Enhanced robot controller features open up more and more new
applications for robot cells. These involve layout- and task-specific details, such as
work objects, grippers, and handled objects [2, 3]. Since all these
factors determine the safety concept, the demand for easy-to-handle safety systems with
short commissioning times is increasing further. More versatile assembly or production
systems need continuous updates of the safety documentation [4]. Safety calculations are
mostly based on minimum permitted distances between the robot and objects or safety zones
within the robot cell. Usually, safety concepts are derived based on assumptions about the
positions of robot and human and their respective maximum velocities. In practice, even
minimal deviations of objects and safety equipment within the real cell can pose a threat
to the worker.

State-of-the-art robot controllers enable complex configurations of safety functions.
ABB’s SafeMove Pro feature, for example, allows for a case-specific reduction of
robot velocity, a restriction of working spaces by means of safety zones, and a limitation
of the range of motion. Some safety zones only exist within the software, which makes them
harder to evaluate within the real cell (Fig. 1). To compensate for the lack of physical
separating safety components, more flexible solutions such as light curtains or laser
scanners are used. Testing an individual component may be easy to accomplish, but
evaluating the overall safety concept is more complicated.

Deviations from the minimum permitted spatial configuration of objects can lead to an
insufficient safety concept. A proper validation concept needs to support the safety
engineer in evaluating spatially complex safety zones. The following solutions are developed
and presented in this work:


• Automated visualization of safety zones and areas based on the specific safety concept.
• Solution to compare the DT to the real cell.
• Robots as reference system.
• Data from the DT is usable to generate test programs.


Fig. 1 Example of a safety configuration for ABB robots (different colors mean different safety zones that are activated by the operator)


1.1 Safety Concepts for Robot Cells 


For a variety of assembly and production systems, different approaches to safety
concepts have been developed. The higher demand for individualized products calls for
production lines to move ever closer to lot size 1 [5]. A well-known concept addressing
this demand is the reconfigurable manufacturing system (RMS) [6].

Separating protective devices are widely used as a solution to ensure machine safety.
Unfortunately, those systems lack flexibility and incur high costs if a change in
cell design is needed. As an alternative, sensor-based safety components such as laser
scanners or light barriers have become established. An example of a cell with different
safety components is presented in Fig. 2.

To ensure the exact position of every safety component, and therefore the safety of the
whole cell, a safety certification must be conducted. To evaluate the risk and to validate
and verify the cell’s safety, standards such as EN ISO 10218-1 [8] and EN ISO 10218-2 [9]
are used.


Fig. 2 Safety components of a robot cell [7] 


2 Materials and Methods


As mentioned before, the main goal of this work is to reduce the validation time of safety
concepts for industrial robot cells. One major part of every safety concept is to examine
the distances between safety-relevant objects, especially workers, and the robot. As
required by German national standards [8, 9] and international standards [10, 11], safety
engineers must examine these aspects. The presented software VISIBLE provides an
automated solution for this process. Based on a DT of the robot cell, distances between
objects can easily be derived. This process generates the setpoints for the distances that
must then be evaluated within the real system. A laser distance sensor mounted on the
robot is used to examine the correct positioning of objects within the cell. The VISIBLE
software plans a tool path for the laser distance sensor to properly measure
the desired distances.

This method is designed to assist the safety engineer while commissioning production 
systems according to the mentioned standards. 

Additionally, AR is used to visualize an overlay of real and virtual objects. The safety
engineer uses AR-glasses (Microsoft HoloLens) to project the robot’s virtual safety
zones while physically standing in the cell. The calibration of the AR environment is
guided by an assistant that automatically calculates the accuracy so that the user
can properly calibrate the AR-glasses. The AR-glasses recognize the position of a
calibration marker and align their coordinate system relative to the robot. A value-benefit
analysis was carried out comparing several different components with respect to criteria
such as:


• accuracy
• time consumption
• necessary knowledge to use the components
• error-proneness
• failure probability
• repeatability
• safety within the process
• projection of safety zone
• calibration
• documentation

The analysis resulted in AR-glasses and a distance measuring device being the most
suitable solution. The hardware used is shown in Fig. 3. For the AR application, a HoloLens
and the HoloLens Development Edition by Microsoft are used. For distance measurement,
a GLM 120 C Professional by Bosch is used.

Fig. 3 Used hardware: distance measuring device (e.g. line laser) (left), AR-glasses HoloLens (right)


The whole concept is manufacturer-independent. New installations and changes are
easy to implement. The new methods of visualization can project even complex contours
such as planes at different height levels. The new system links solutions for robotics and
AR and lowers the barriers to entry for advanced industrial robot cells.


3 Results 


The following section presents the results of this work. It is structured in different parts 
each regarding different aspects of the solution. 


3.1 Development of a Method to Verify Safety Components 


Initially, a literature review was conducted to get an overview of the state-of-the-art
approaches to verifying safety components. The approach builds on several standards and
guidelines, such as EN ISO 10218, EN ISO 12100, and EN ISO 13849, among others. These
standards list different methods for planning, design, risk evaluation, verification,
validation, and documentation of safety components. The state-of-the-art sequence
of the necessary tasks is presented in Fig. 4.

After evaluating the established sequence for plant commissioning, a new method was
developed. The implementation of the new method includes a comparison of the layout
of the cell with the information from the DT. The process enables automated documentation
by picking up contours of physical and optical safety components from the planning
tool in relation to the robot base coordinate system. This feature is included in the newly
developed sequence presented in Fig. 5.

Furthermore, the use of the HoloLens offers a solution to examine virtual safety zones.
Zones that only exist within the robot controller lack a physical counterpart in the real
cell, which makes them hard to evaluate. Visualizing these zones using AR offers an
easy-to-handle solution for the safety engineer.

Fig. 5 Method for automated commissioning of robot cells


The planning and verification of safety components are to be connected to the developed
software. An analysis of robot safety controllers and their interfaces was carried out
to identify those controllers that are able to integrate safety zones. These controllers are:

• ABB SafeMove2
• Denso Safety Motion
• Fanuc Dual Check Safety
• KUKA KUKA.SafeOperation
• Stäubli CS9
• Yaskawa/Motoman Functional Safety Unit


The safety controllers of the above-mentioned manufacturers were also evaluated as to
whether they can connect to measuring or visualization devices. The controllers of ABB and
KUKA best fulfill the requirements.


3.2 Development of the Planning, Communication and Verification Software


The overall approach for the software architecture is presented in Fig. 6. The concept
illustrates the interaction of planning, communication, and validation functions. After a
detailed analysis, the “line laser” and “laser projection” methods were found to be
unsuitable.

Fig. 6 Architecture of the software for planning, communication, and verification

The developed tool converts the parametrized safety zones from the robot controller
into mesh geometry for further use. The converted objects are easily visualized via the
AR-glasses. Both the desired and the actual state can be displayed.
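
How the controller parametrizes a safety zone is not specified here. As a minimal sketch, the following example assumes that a zone is given as a convex polygonal footprint with a lower and an upper height limit and extrudes it into a triangle mesh. The function name and the example zone are hypothetical.

```python
def extrude_zone(footprint, z_min, z_max):
    """Convert a convex footprint [(x, y), ...] into a prism mesh.

    Returns (vertices, triangles); triangles index into vertices.
    A counter-clockwise footprint yields outward-facing triangles.
    """
    n = len(footprint)
    if n < 3:
        raise ValueError("a footprint needs at least three corners")
    vertices = [(x, y, z_min) for x, y in footprint]
    vertices += [(x, y, z_max) for x, y in footprint]
    triangles = []
    for i in range(1, n - 1):
        triangles.append((0, i + 1, i))          # bottom face (fan)
        triangles.append((n, n + i, n + i + 1))  # top face (fan)
    for i in range(n):
        j = (i + 1) % n
        triangles.append((i, j, n + j))          # side wall, first half
        triangles.append((i, n + j, n + i))      # side wall, second half
    return vertices, triangles


# Hypothetical rectangular zone of 2 m x 1 m, reaching from 0 to 1.8 m (in mm).
zone = [(0.0, 0.0), (2000.0, 0.0), (2000.0, 1000.0), (0.0, 1000.0)]
vertices, triangles = extrude_zone(zone, 0.0, 1800.0)
```

For a footprint with n corners, the sketch produces 2n vertices and 4n - 4 triangles, which an AR rendering engine could display as a closed volume.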

To calibrate the AR-glasses within the robot cell, a marker and an assistant were
developed. An iterative algorithm calculates the position of the glasses relative to
the robot position based on predefined points. The accuracy of the calibration process
increases with every calibration step. A stop criterion ends the iteration
as soon as the accuracy does not increase any further. This is a state-of-the-art process
for calibrating AR-glasses. For the given scenario, several different marker positions were
examined by trial and error.
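
The stop criterion can be sketched as follows. The example assumes that each calibration step yields an estimate of the remaining error in millimetres and that the iteration ends once a step improves this estimate by less than a threshold. The function name, the threshold, and the error values are hypothetical.

```python
def calibrate_until_converged(error_per_step, min_improvement_mm=0.1):
    """Consume error estimates step by step until they stop improving.

    Returns the number of steps used and the best error estimate seen.
    """
    best = None
    steps_used = 0
    for error in error_per_step:
        steps_used += 1
        if best is not None and best - error < min_improvement_mm:
            best = min(best, error)
            break  # accuracy no longer increases noticeably
        best = error
    return steps_used, best


# Hypothetical error estimates after each calibration step (in mm).
steps, final_error = calibrate_until_converged([12.0, 8.0, 6.0, 5.95, 5.0])
print(steps, final_error)  # 4 5.95
```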

Both simple measuring points and complex contours can be defined within the soft- 
ware. The measurement of these points and contours can be performed automatically 
using the software by direct control of the laser measuring device and the robot control- 
ler. After the measurement, the results are automatically summarized in a report. Figure 7 
shows the user interface of the planning software. 

In order to be able to move the robot to the defined points or to have the robot move 
along the defined contours, a system was introduced that does not require inverse kin- 
ematics. The system is based on a rough pre-positioning of the robot by the commis- 
sioning engineer. The pre-positioning can then be used to calculate the final poses for 
the defined points and contours. Subsequently the check points and their distances to the 
robot as well as any deviations between measured (in real cell) and calculated (in virtual 
cell) distances are summarized in a technical report. 
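
The comparison of calculated and measured distances can be sketched as follows. The data structure, the tolerance value, and the example check points are hypothetical and only illustrate the kind of evaluation that such a report could contain.

```python
from dataclasses import dataclass


@dataclass
class CheckPoint:
    name: str
    setpoint_mm: float  # distance calculated in the virtual cell (DT)
    measured_mm: float  # distance measured by the laser in the real cell


def deviation_report(check_points, tolerance_mm):
    """Return (name, deviation in mm, within tolerance) for each point."""
    rows = []
    for point in check_points:
        deviation = point.measured_mm - point.setpoint_mm
        rows.append((point.name, deviation, abs(deviation) <= tolerance_mm))
    return rows


# Hypothetical check points, not measurements from this work.
points = [
    CheckPoint("fence_corner", 1500.0, 1503.0),
    CheckPoint("light_curtain", 800.0, 790.0),
]
report = deviation_report(points, tolerance_mm=5.0)
```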


Fig. 7 User interface of the planning software 


3.3 Development of a Modular Measuring and Visualization System


Prior to using the planning software, two calibration processes are needed. First, the
operating point of the distance measuring device must be described within the robot base
coordinate system. Second, a connection between this coordinate system and the AR-glasses
must be established. This leads to three functional components that must be
mounted on the robot:


• A device for time-of-flight measurement to perform measurements in the real robot cell.
• A calibration marker to link the coordinate systems of the AR-glasses and the robot base.
• Two test prods to calibrate the robot tool.


The components are mounted on a plate with a standardized hole pattern so that it can be
mounted directly on the robot flange. The modular setup allows a case-specific mounting of
the components on the tool. Figure 8 shows the tool.

The accuracy of this calibration method was tested using an ABB IRB 4600. For
the calibration of the distance measurement device, an angular error between 0.09° and
0.2° between the measured robot tool and the laser beam was measured. For more
acute angles between laser beam and surface, an increase of the angular error is observed.
The error for the alignment of the light point (distance of 1000 mm, beam angle of 90°
to the surface) is less than 4 mm in an unpredictable direction. Therefore, check points
must have a minimum distance to corners and edges to ensure that the desired object is
hit. The precision of the test can be further increased by using a more accurate robot.
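
The relation between the angular error and the displacement of the light point can be checked with a short calculation. The sketch assumes a beam that hits the surface perpendicularly, so that the displacement equals the distance multiplied by the tangent of the angular error; the function name is hypothetical.

```python
from math import radians, tan


def spot_offset_mm(distance_mm, angular_error_deg):
    """Lateral displacement of the laser spot on a perpendicular surface."""
    return distance_mm * tan(radians(angular_error_deg))


print(round(spot_offset_mm(1000.0, 0.09), 2))  # 1.57
print(round(spot_offset_mm(1000.0, 0.2), 2))   # 3.49
```

Under this assumption, the upper bound of the measured angular error corresponds to roughly 3.5 mm at a distance of 1000 mm, which is consistent with the stated alignment error of less than 4 mm.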


Fig.8 Modular robot tool 

To calibrate the connection between robot base and AR-glasses, the calibration
marker is used at different calibration positions. For every position, the connection
between both systems is known, and therefore the pose of the robot coordinate system
relative to the AR-glasses system can be derived. The highest accuracy was achieved when
the marker was viewed (by the AR-glasses) along an arc of 90°. This method allows
a visualization with an accuracy of 5 mm between the virtual safety zone from the DT and
the visualized safety zone in augmented reality.
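
The derivation of the relative pose from a single marker observation can be sketched with rigid transforms. The example assumes that the marker pose is known both in the robot base frame (from the robot kinematics) and in the AR-glasses frame (from marker tracking); each transform is represented as a rotation matrix and a translation vector. All function names and numerical values are hypothetical.

```python
def compose(a, b):
    """Return the transform that applies b first and then a."""
    (ra, ta), (rb, tb) = a, b
    r = [[sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(ra[i][k] * tb[k] for k in range(3)) + ta[i] for i in range(3)]
    return r, t


def invert(transform):
    """Invert a rigid transform (R, t) to (R^T, -R^T t)."""
    r, t = transform
    rt = [[r[j][i] for j in range(3)] for i in range(3)]
    return rt, [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]


def apply(transform, point):
    r, t = transform
    return [sum(r[i][k] * point[k] for k in range(3)) + t[i] for i in range(3)]


def robot_from_ar(robot_from_marker, ar_from_marker):
    """Map AR-glasses coordinates into the robot base frame."""
    return compose(robot_from_marker, invert(ar_from_marker))


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
robot_from_marker = (identity, [1000, 0, 500])  # marker pose in robot frame
ar_from_marker = (rot_z_90, [0, 2000, 0])       # marker pose in AR frame
transform = robot_from_ar(robot_from_marker, ar_from_marker)

# The marker origin seen by the glasses must map to its robot coordinates.
marker_in_robot = apply(transform, [0, 2000, 0])
```

With several marker positions, as described above, the individual estimates could be combined to reduce the influence of tracking noise.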


4 Summary and Conclusion 


A new system for a safety analysis of robot cells was presented. Usually the process of 
safety analysis is very time consuming while commissioning new robot cells and has 
great potential for automation. A software was introduced that partly automated chooses 
points within the digital twin of a robot cell. The software then calculates poses for the 
robot in which the distance of the checkpoints from the virtual model can be measured in 
the real work cell. Within the performed tests an accuracy of 5 mm was achieved. How- 
ever for greater distances the angular error further increases. 

The increasing computing power of robot controllers allows increasingly complex virtual
safety zones to be used within the safety setup. This makes it harder for engineers to
evaluate the safety concept. Therefore, AR-glasses were used to display virtual
safety zones in the real cell. To calibrate the system, a method that connects the
coordinate system of the AR-glasses to that of the robot was developed using a
calibration marker. This method allows the holograms of the safety zones to be displayed
with an overall accuracy of 5 mm.

The presented system contributes to the aim of decreasing commissioning time and
assists the safety engineer in evaluating increasingly complex safety concepts.


Abstract


Despite the advancing trends in automation, especially with regard to Industry 4.0,
workers performing non-automated activities in industrial factories often face repetitive
tasks with heavy workloads. Whenever methods or adaptations in technology and
organization are insufficient to optimize working conditions, person-related interventions
such as exoskeletons come into question. They may prove successful in alleviating
musculoskeletal disorders and relieving physical strain. The increasing number of
exoskeletons makes it challenging for users and companies to select or specify an
appropriate system for their applications. To address this problem, this paper presents
the possibility for developers to use a digital twin for evaluating particular support
characteristics of exoskeletons at an early stage of product development. The process
for a user-specific design depends strongly on the activity and its environment.
As a use case for the validation of a digital twin, an overhead work activity is
analyzed and relevant factors such as muscle activity are examined in this paper. Initial
simulation results show promising possibilities for varying the parameters of an
industrial work process in order to create a starting point for the future
development of an optimally tailored upper body exoskeleton.


Keywords 


Exoskeletons - Digital Human Model simulation - Biomechanical analysis - Human- 
Machine Interaction 


1 Introduction 


Nowadays, advancing technology in industry, such as automation and mechanization,
affects the redesign of workplaces as well as the implementation of new systems
in industrial applications (e.g., systems for human-machine cooperation [1],
exoskeletons [2], or augmented reality systems [3]). Demographic change forces employers
to provide more technical assistance systems to reduce musculoskeletal loads and enable a
longer, healthier and safer working life [4, 12]. More than one in three workers handles
heavy goods during the workday, and 43 percent work daily in tiring, exhausting,
or painful postures [4]. As a result, workers are physically burdened and exposed to a
risk of developing musculoskeletal disorders (MSD) [4]. According to [5], days of
incapacity to work caused by MSD lead to a loss in gross value added of 21.6 billion euros.
Forecasts predict a worldwide market volume of up to 5.6 billion dollars in 2025 for the
exoskeleton industry, in which work-assisting devices in particular will grow exponentially
[6]. Upper body exoskeletons will take on a primary role for possible future solutions
for specific work tasks like lifting heavy parts and for overhead working tasks. Regard- 
ing industrial applications, exoskeletons are externally wearable mechanical devices [7] 
that either empower, facilitate, stabilize, or add movements [8]. Support systems such as 
exoskeletons are used with the aim of reducing strain on workers without having to make 
extensive interventions in the work process flow [8]. An ergonomic design of the work 
process can reduce the development of musculoskeletal disorders but can also become an 
economic challenge in the case of significant process and product changes [9]. In prac- 
tice it is difficult and costly to prove the exact effectiveness of exoskeletons for support- 
ing specific work activities. Laboratory and field studies are conducted for this purpose, 
but they require a great amount of time and expense in product development [10]. Cur- 
rently, there are no exoskeletons on the market that can be manufactured according to 
variable parameters and specified boundary conditions. Users cannot find a suitable sys- 
tem that satisfactorily addresses their individual requirements - for example, movements 
are restricted, kinematic structures do not match movement patterns or interfaces are 
uncomfortable [10]. The process for a user- and task-specific new design of exoskeletons
is influenced by various aspects. The technically complex replication of joints (e.g., the
shoulder joint with several degrees of freedom) is always a challenge in system
development, since movement restrictions for the user must be avoided. There
is a need for methods to validate and optimize exoskeletons for the intended task with
respect to movements and loads. End users of exoskeletons vary in population
characteristics such as anthropometry, muscle strength, body mass, and manner of executing
movements, and each application scenario varies in its movement- and load-specific
boundary conditions. Digital human models that combine the human and the exoskeleton in a
single biomechanical system offer the chance to consider all these aspects in parallel [6].


2 Evaluation of the Biomechanical Requirements for the 
Digital Twin 


A detailed evaluation of the workplace and the working environment, as well as the choice
of the right biomechanical parameters to improve, is of great importance as the first step
in creating a digital twin.


2.1 Analysis of the Workflow and the Specific Environment 


Typical human activities in, e.g., the automotive industry, aircraft production, logistics
and retail include handling loads, performing tasks at or above head height, and assembling
very small products. These and other tasks lead to different strains (e.g. with regard to
the body region). The analysis of such tasks is important, as it defines the starting point
for interactions between human and technology in order to improve the quality of work
and to relieve employees. Four main activities in particular need to be distinguished in
the industrial context: lifting and carrying, working at and above head height, pushing and
pulling, and drilling and screwing. Depending on the activity, different parts of the body
are stressed to different degrees [12]. On the basis of the identified task, various
distinguishing characteristics can be derived for activities in industrial production; these
must be taken into account when introducing an exoskeletal system into the workflow of a
company:


• Dynamic and static activities: Distinction with regard to the speed of movement.
Highly dynamic activities on the one hand are distinguished from activities in static
positions on the other [12].

• Variance of tasks within the activity: For example, rotating workplaces with standing and
sitting parts, and tasks at and above head height, as opposed to monotonous work. The
characteristic is the extent or ratio of secondary activities within a main activity [5].

• Activities with and without components/tools: Depending on the activity, different
objects or tools have to be used, whose handling and thus the regional stress can
differ significantly (e.g. screwing with a screwdriver or screwing with a cordless
screwdriver).

• Weights to be handled: Differentiation with regard to the weights of the parts,
components, assemblies, workpieces or tools to be handled. The weights can vary from a few
grams up to 30 kg.

• Process forces/interaction forces: Amount of force acting on the body during an
activity. This does not include tool/component weights, but forces that act due to the
work process, such as torques or contact pressure during the assembly of a component.

• Range of movement: Activities are not only carried out with different postures, but
also in different ranges of movement. This is particularly important for occupational
safety (e.g. collision with work equipment, falls) but also for the assessment of
stresses, since small ranges of movement (e.g. assembly locations that are difficult to
access) cause more forced postures.


2.2 Biomechanical Analysis 


In order to determine the requirements and the exact need for support by an exoskeletal
system, the movement sequence of the workflow must be analyzed biomechanically
beforehand. For this purpose, it is crucial to know which parameters can be examined.
Biomechanical analysis usually covers different aspects of the interaction of the human
with the environment, e.g. the movement (kinematics), external forces and moments (kinetics)
acting on the body or caused by its interaction with the environment, internal forces, and
the muscle activity that causes voluntary body movement. The following list
summarizes a selection of the most important values used to evaluate the biomechanical
effects of a physical support system such as an exoskeleton [11].


• Body movement: Specific motion patterns such as joint angles, trajectories, dynamics
(velocities, accelerations) with and without support.

• Muscle activity: Muscle activity in percent of maximum muscle activity (MVC,
maximum voluntary contraction).

• Cardiovascular/cardiopulmonary activity/metabolic effort: Heart rate, O₂/CO₂
ratio, relative VO₂ in ml/min/kg.

• Force/torque (inertial, external): Joint reaction forces/torques, ground reaction
forces, balance, center of pressure.

• Individual perception: Comfort level, user acceptance, psychological aspects.


Of the biomechanical analyses listed in this section, the one used most frequently in
the evaluation of physical support systems (exoskeletons) is electromyographic analysis
(EMG) [14]. EMG measurements are expressed relative to a reference value, which is
obtained from maximum voluntary contraction (MVC) measurements taken directly prior to
the actual measurements. The general idea is that the MVC measurement represents a
contraction value of 100%. All subsequent measurements are then presented as a
percentage of the MVC [14].
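
The normalization to the MVC reference can be illustrated with a minimal Python sketch.
The function name and the amplitude values are hypothetical and chosen only for
illustration; they are not measured data, and real EMG processing would additionally
involve filtering and rectification, which are not shown here.

```python
def percent_mvc(emg_amplitudes, mvc_reference):
    """Express EMG amplitudes as a percentage of the MVC reference value."""
    if mvc_reference <= 0:
        raise ValueError("MVC reference must be positive")
    return [100.0 * amplitude / mvc_reference for amplitude in emg_amplitudes]


# Hypothetical amplitudes in arbitrary units (illustration only, not measured data)
trial_amplitudes = [0.5, 1.0, 1.5]
mvc_amplitude = 2.0  # amplitude recorded during the MVC measurement

normalized = percent_mvc(trial_amplitudes, mvc_amplitude)
print(normalized)
```

A value of 100 then corresponds to the amplitude reached during the MVC measurement.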


2.3 Digital Human Model


Basically, the implementation of a digital twin requires suitable tools and a configurable
digital environment. Digital Human Models (DHM) are required in order to carry out dynamic
biomechanical analyses for the prediction of relevant biomechanical parameters (e.g.
internal forces, torques, muscle activities) acting on the human body. DHM software is a
computer-aided design tool [15]. With it, a design can be evaluated from
an ergonomics perspective using virtual simulation before the real physical prototype is
built. Popular, commercially available DHM software includes JACK,
Sammie, Ramsis, OpenSim and the Biomechanics of Bodies (BoB) [10]. Some important
previous research [15] was conducted with a biomechanical modelling system,
namely the AnyBody Modeling System (AMS). This system makes it possible to
investigate biomechanical interactions at the musculoskeletal level. In such
musculoskeletal models, structures like bones, tendons and muscles are modelled in great
detail. AMS offers the possibility to simulate these models in interaction with their
environment and to perform an inverse kinematic analysis.

To perform a simulation, motions and reactions normally have to be recorded from a
human subject and transferred to the DHM. Motion capture is usually used to record
movements. The entire human movement is captured with the help of a limited
number of markers on the body via optical, three-dimensional kinematic cameras [14].
Afterwards, the positions and orientations of the markers are transferred to the DHM with
inverse kinematics to perform the simulation.

For this paper AMS was chosen because of the possibility to model additional 
mechanical variables (exoskeletal effect) in the simulation environment. 


2.4 Process for the Prototypical Implementation of a Digital 
Twin 


• Step 1 - Analysis of the biomechanical need for physical support: According to
Sect. 2.1, the relevant task(s) in the workflow must be analyzed in detail beforehand
so that no important factors are overlooked.

• Step 2 - Modeling of activities: After deriving the work activity to be investigated,
the motion sequence must be recorded in detail using motion capture.

• Step 3 - Consideration of existing exoskeletal systems: As a starting point for the
developers, exoskeletons that are already available on the market should be analyzed
for the application case. Already realized designs can be verified regarding their
biomechanical effect. This effect can be incorporated in the simulation as a starting point
for a possible physical support system that must be optimized.

• Step 4 - Creation of the DHM: After selecting the appropriate DHM as described in
Sect. 2.3, the digital twin must be parameterized according to the size and weight of the
users. The movement must be imported from the motion capture recording and validated for
further simulation.

• Step 5 - Applying load and support force in the digital human model: The mechanical
parameters of the exoskeleton (in the form of external forces or
moments with defined points of application on the human body) must be implemented
in the simulation. For this virtual effect, no CAD model of the exoskeleton is
needed.

• Step 6 - Setting different support characteristics: The input variables (forces,
moments) defined in step 5 must be varied (parameter study) in order to
define an optimal configuration of the physical support for the user. Other external
variables (environmental parameters) must also be implemented in this step.

• Step 7 - Evaluating the relevant biomechanical parameters (parameter study):
Biomechanical parameters (output variables) must be evaluated (inverse dynamic
analysis). Possible anomalies and correlations should be investigated, and the optimal
physical support for the specific user requirements must be identified.
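
Steps 5 to 7 amount to a parameter sweep over input variables followed by an evaluation
of the output variables. The following Python sketch shows only this loop structure.
The function `placeholder_model` is a made-up stand-in for the inverse dynamic analysis
in the DHM; its formula and numbers are assumptions for illustration and carry no
biomechanical meaning.

```python
from itertools import product


def parameter_study(loads_kg, support_torques_nm, evaluate):
    """Evaluate every combination of load and support torque (steps 5 to 7)."""
    results = {}
    for load, torque in product(loads_kg, support_torques_nm):
        results[(load, torque)] = evaluate(load, torque)
    return results


def placeholder_model(load_kg, torque_nm):
    # Hypothetical stand-in for the DHM simulation: NOT a biomechanical model.
    # It only mimics "activity rises with load and falls with support".
    return max(0.0, 10.0 + 5.0 * load_kg - 0.5 * torque_nm)


study = parameter_study([0, 2, 4], [0, 6, 12], placeholder_model)

# Configuration with the lowest output value according to the placeholder
best = min(study, key=study.get)
print(best, study[best])
```

In practice, `evaluate` would trigger a simulation run and return the biomechanical
output variable of interest, for example a muscle activity statistic.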


3 Application Example: Overhead Lifting Task 


To illustrate the application of a digital twin for an exoskeletal system evaluation, an 
example with corresponding parameter study has been provided in this paper. The aim 
of this evaluation is to determine the different effects of weight and support changes on 
muscle activity for the defined movement sequence. A common overhead work activity 
from the industrial sector was selected as a use case: Overhead (right arm) lifting activ- 
ity. To investigate the effect of exoskeletal support when lifting a variable load, a model 
was built in the AMS to replicate the characteristics of the exoskeleton Lucy [9] as a 
starting point. Lucy is a shoulder exoskeleton and supports the abduction/elevation of the 
upper arms by means of a pneumatic actuator depending on the angle between the upper 
arm and the upper body (upper arm elevation angle) [9]. The model created in AMS gen- 
erates a torque in the right shoulder depending on the angle between the humerus and 
thorax, which acts upwards. It therefore supports the lifting of the arm forwards
(anteversion) and sideways (abduction). Fig. 1 shows the movement sequence,
called the humerus-thorax elevation. This movement was recorded with motion capture
according to step 2 in Sect. 2.4.

Fig. 1 Movement sequence of the overhead work activity from 0% to 100% of the movement
cycle (displayed in AMS)

The subject stands upright at the beginning of the movement. The left arm hangs
freely downwards. The muscles of the left arm are not examined in this study, which
is why the movement of the left arm during the drilling activity is not described further
here. The right arm is already pointing forward at the start of the movement. During the
motion capture recording, the test person held the drilling tool in his right hand. The
tool is not shown in the simulation. Its mass is changed in the different scenarios
and acts at the center of the right palm. The test person moves the right arm evenly
upwards until the tip of the tool reaches the point of action. In a realistic drilling process,
the subject would now increase the pressure to perform the drilling. This detail is
omitted from the simulation for reasons of clarity and simplicity. Since different masses
are defined as the tool weight in the different scenarios, comparable effects should be
captured this way. This use case is limited to a consideration of muscle activities. Muscle
activity is defined as the active state of the muscle as a fraction of the maximum voluntary
contraction. This means that at a muscle activity of 100%, the muscle has reached its
theoretical load limit. A value greater than 100% is not possible in practice, but can occur in
the simulation [13]. The muscle activities serve as a representative parameter for
measuring the relieving effect of the exoskeleton on the human musculoskeletal system.
The maximum muscle activity, compared with the average, gives information about the
unevenness of the effort: the further the maximum is from the mean value, the more
irregular the load.

The parameter study is based on several scenarios; in each of them, one parameter is
changed and a comparison is made. In the first comparison, two simulations are carried
out in which the movement is simulated without exoskeletal support. The load is 0 kg in
the first simulation and 2 kg in the second. The simulation without
load corresponds to simply lifting the right arm. These two scenarios are simulated,
among other reasons, to obtain a reference for the results and to carry out
a plausibility check of the simulation. The assumption that muscle activity
increases with increasing load was confirmed. In order to analyze the influence
of the load, simulations with 4, 6 and 10 kg are carried out sequentially. In the
second comparison, the movement is simulated with the support of the assisting torque.
The implemented torque curves (virtual torque applied to the shoulder hinge) are based
on the exoskeleton Lucy and are shown in Fig. 2. The theoretical maximum assistance
torque of the exoskeleton Lucy is 12 Nm at a humerus-thorax angle of 90 degrees [9].


Fig. 2 Support torque with maximum peak at 90 degrees (x-axis: elevation angle between
right humerus and thorax [deg], 0 to 200; y-axis: support torque [Nm]; curves:
T(max) = 6 Nm and T(max) = 12 Nm)
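
An angle-dependent support torque of this kind can be sketched in Python as follows.
The exact curve shape of the exoskeleton Lucy is not given in the text, so the sine
shape, the zero crossings at 0 and 180 degrees, and the function name are assumptions
made purely for illustration. Only the peak location (90 degrees) and the peak values
(6 Nm and 12 Nm) are taken from the description above.

```python
import math


def support_torque(elevation_deg, t_max_nm=12.0):
    """Assumed sine-shaped support torque [Nm] peaking at 90 degrees elevation."""
    # Assumption: no support outside the open interval (0, 180) degrees
    if elevation_deg <= 0.0 or elevation_deg >= 180.0:
        return 0.0
    return t_max_nm * math.sin(math.radians(elevation_deg))


# Sample both assumed curves at a few elevation angles
for angle in (0, 45, 90, 135, 180):
    print(angle, round(support_torque(angle, 6.0), 2), round(support_torque(angle, 12.0), 2))
```

In the simulation, such a function would supply the virtual torque applied to the
shoulder hinge for each frame of the recorded movement.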


3.1 Results 


In this use case, only the most heavily strained muscles in the shoulder were taken into
consideration. For this purpose, the Deltoideus (anterior), Supraspinatus and Infraspinatus
were analyzed in the different scenarios. For a better understanding, the structure of the
shoulder muscles is illustrated in Fig. 3 (right side). Fig. 3 (left side) shows that in
the part of the cycle where the applied torque is high, the effect on muscle activity is
also high. It can also be seen that the effect on the Deltoid muscle is greater than on the
Infraspinatus muscle. Especially in the last third of the cycle, the exoskeletal support
does not have such a large effect on the muscle activity of these two muscles.

The results of all the different scenarios were summarized in Figs. 4 and 5. For this 
purpose, they are displayed in boxplots. The minimum, the lower quartile, the median, 
the upper quartile and the maximum are shown. For each muscle, a boxplot was created 
with one box per simulation scenario. 
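
The five values shown in each box can be computed as in the following Python sketch.
The sample values are made up for illustration and are not simulation results. Note
that plotting tools use different quartile conventions; the "inclusive" method of the
standard library is used here as one possible choice.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return minimum, lower quartile, median, upper quartile and maximum."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": q3 - q1,  # interquartile range, i.e. the height of the box
    }


# Hypothetical muscle activity samples in % MVC (illustration only)
summary = five_number_summary([5, 1, 4, 2, 3])
print(summary)
```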


Fig. 3 Left: Muscle activities under varying support (torque) with 2 kg load (x-axis:
movement cycle; y-axis: muscle activity (% MVC); curves: Deltoideus, Supraspinatus and
Infraspinatus at 0 Nm and 12 Nm support). Right: Anatomy of the shoulder


Fig. 4 Muscle activity of the Supraspinatus (left) and Infraspinatus (right) under varying
support and load


The main task of the Supraspinatus muscle in this use case is abduction and external
rotation of the upper arm, especially below an abduction angle of 15°. When the second
box (0 Nm support, 2 kg load) and the fourth box (12 Nm support, 2 kg load) in Fig. 4
(left) are compared, it can be seen that the torque support clearly reduces muscle
activity. The last three boxes (on the right of the diagram) show that an increase in load
also increases the scatter of the data.

The Infraspinatus muscle is mainly responsible for the external rotation of the upper 
arm. The exoskeletal support has a different influence here than on the Supraspinatus 
muscle. It is noticeable in Fig. 4 (right) that in the second to fourth box (from the left 
side of the diagram) the upper quartile and the maxima are very close together. Here it is 
mainly the interquartile range that changes. This means that there is definitely a relief of 
the Infraspinatus through the exoskeletal support. 

The Deltoideus Anterior muscle is largely involved in lifting the arm forward and is
therefore the most important muscle in this activity. The greatest effect on muscle activity
is therefore expected here. This can be seen in the second to fourth boxes in Fig. 5 (from
the left side). With successive increases in torque, all values decrease steadily. Increasing
the load results in clearly higher muscle activity, and the scatter of the data also increases.


3.2 Discussion of the Results


In general, the simulation results show that a higher supporting torque leads to lower
muscle activities in the Deltoideus Anterior, Infraspinatus and Supraspinatus. The lowest
muscle activities were seen at lower loads and greater support. An important observation
is that for this movement in this simulation, the average muscle activity increases
approximately linearly with a constant increase in load. However, it is not possible to predict
whether this trend continues with further increases in load. The simulation is
particularly suitable for looking at the muscles individually, since different effects can be
observed for the individual muscles. For example, in the Supraspinatus, the dispersion of the
data increases when the load is increased. With increased torque, on the other hand, the
dispersion remains similar and the mean value of the activities decreases. The interquartile
range (IQR) also remains similar, suggesting that the load remains consistently strenuous. For
the Infraspinatus, the dispersion of the data increases with higher support torque, but the
mean value decreases. For the Deltoideus Anterior, the dispersion remains relatively
constant with an increase in torque. The aim for an optimal system is to keep the dispersion as
low as possible in order to keep muscle activity at a constant low level [13]. In general, the
results confirm that muscle activity increases with higher load and decreases with higher
support torque. For the downward movement of the arm, the support torque has less effect on
muscle activity, as muscle work has to be done against the system. It should be noted that
the weight of the exoskeleton was not considered, on the assumption that it would not
have a major impact on the muscle activities considered here. Contact forces were also not
taken into account for the use case presented. It is assumed that the contact forces have a
greater influence on the subjective feeling of comfort when wearing an exoskeleton than on
the measurable muscle activities. Another point of discussion is the way in which the
support torque is transmitted to the body and how exactly it acts. In the virtual model (DHM),
it is a torque in the shoulder hinge without contact points.


4 Conclusion and Outlook 


This paper has examined the possibility and the potential of using a digital twin
to evaluate different characteristics of a physical support system. In
this first approach, promising evaluation possibilities could be shown without using a real
system. The digital twin provides a fast and agile way to investigate user-specific
configurations and derive an optimal support setting, which is essential for the construction
of a real exoskeleton. The next step will be to derive the appropriate mechanical design of
the exoskeleton from the optimal support characteristics. For this purpose, mechanical
(active/passive) elements have to be dimensioned to generate the necessary support.
One possible validation of the design would be to import the CAD model of the exoskeleton
into the DHM.


Abstract


Synthetic data is an indispensable supplement to difficult-to-acquire real data in order
to meet the substantial demand of machine learning based systems. Since data plays the
key role in machine learning models, objective and maintainable quality metrics for it are
vital for quality assurance of the whole system. This paper introduces a systematic and
domain-neutral methodology based on formalized scenario variation and experimentable
digital twins for the generation of synthetic data. The methodology uses human-readable
scenarios and semantically meaningful parameter variations to describe possible entities,
actions and events to be simulated, whereas experimentable digital twins bring the scenarios
to life by integrating various domains of a system, such as mechanics, sensors,
actuators and communication, under one platform that can be simulated as a whole. The
scenario description and digital twin simulation are carried out iteratively to derive the
optimal distribution of synthetic data. Thus, scenarios and experimentable digital twins
can together serve as mediums to systematically cover diverse application scenarios, test
dangerous situations and find faults within a system.


1 Introduction


The increasing complexity of machine learning (ML) based systems necessitates rigorous 
design and validation approaches to ensure correctness and trustworthiness of a system.
Unlike traditional algorithms composed of specified logical rules, ML algorithms are data- 
driven and implicitly derive their own inferences. The “reasoning” and results are often 
unpredictable and difficult to interpret, resulting in the loss of transparency of a system. 
This renders powerful techniques used in traditional software development, e.g. unit testing 
and regression testing, either ineffective or in need of serious modifications. 

The data-collection process can be quite expensive or in some cases impossible;
simulations therefore fill the resulting gap with synthetic data. Quality assurance for
the ML-based system requires a substantial volume as well as verifiable quality metrics 
of the synthetic data. This paper presents a systematic and domain-agnostic methodology 
for synthetic data generation that addresses two aspects of data quality: transparency and 
the diversity of scenarios behind the data. The methodology is based on formal application 
scenario descriptions, appended with formal scenario variation descriptions, and experi- 
mentable digital twins. Application scenarios describe the environment, the entities, actions, 
goals and the initial configuration of an experiment. Conversely, the digital twin of a sys- 
tem is its comprehensive representation, i.e. it collects the set of knowledge representations 
of a system that may belong to different domains and cater to diverse functionalities. The 
digital twin can be simulated within virtual testbeds, platforms that provide various simu- 
lation functionalities, to create the experimentable digital twin (EDT) [14]. The proposed 
methodology integrates the two concepts in an iterative manner. 


2 State of the Art 


To achieve simulation-based variation, a target scenario is typically explicitly modelled 
in the simulation platform of choice, and its parameters are varied accordingly—e.g. the 
steering angle and acceleration of a constant turn rate and acceleration (CTRA) model [16]. 
On the other end, adversarial methodologies are increasingly used to challenge ML-based 
systems, where another ML-system iteratively generates adversarial configurations for the 
system-under-test [5]. 
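
Simulation-based variation with an explicitly modelled motion model can be sketched in
Python as follows. This is a simplified forward-Euler approximation of a constant turn
rate and acceleration (CTRA) model, not the formulation used in [16]. The class and
function names, the step size and the varied values are assumptions for illustration,
and the yaw rate is varied here in place of a steering angle.

```python
import math
from dataclasses import dataclass


@dataclass
class CtraState:
    x: float        # position [m]
    y: float        # position [m]
    heading: float  # yaw angle [rad]
    speed: float    # longitudinal speed [m/s]


def ctra_step(state, yaw_rate, accel, dt):
    """Advance the state by dt (forward Euler) with constant yaw rate and acceleration."""
    return CtraState(
        x=state.x + state.speed * math.cos(state.heading) * dt,
        y=state.y + state.speed * math.sin(state.heading) * dt,
        heading=state.heading + yaw_rate * dt,
        speed=state.speed + accel * dt,
    )


def simulate(state, yaw_rate, accel, dt, steps):
    """Return the final state after a fixed number of integration steps."""
    for _ in range(steps):
        state = ctra_step(state, yaw_rate, accel, dt)
    return state


# Vary the two model parameters over a small hypothetical grid
variants = {
    (yaw_rate, accel): simulate(CtraState(0.0, 0.0, 0.0, 10.0), yaw_rate, accel, 0.1, 10)
    for yaw_rate in (0.0, 0.1)
    for accel in (0.0, 1.0)
}
```

Each entry of `variants` is one concrete trajectory end state; the drawback noted in the
text is that the varied parameters are tied to this particular model and platform.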

Both ends of the spectrum lack a generic, platform-independent and semantically
meaningful description of the scenario and the parameters to be varied. Jian et al. use a
configurable scene grammar to describe static scenes, with stochasticity as part of the
description to capture the possible scene variation [11]. Fremont et al. developed a
probabilistic programming language to describe dynamic scenarios with variation, which can
be integrated into a simulation engine [9]. Fremont et al. [10] use the formal probabilistic
description in SCENIC for test-case generation of autonomous vehicle safety scenarios.
Another prevalent approach for describing dynamic scenes is found in the automotive
industry in the OpenSCENARIO standard [2]. The PEGASUS methodology [3] defines logical
scenarios as a supplement to the OpenSCENARIO description to specify parameter variation.
In contrast to SCENIC, the decoupled description of the scenario and scenario variation
offers a higher potential for systematization and optimization of scenario variation, as
will be seen later in this paper. The PEGASUS methodology, however, does not deal with
complex probability distributions and inter-parameter constraints for parameter variation,
which are addressed in this paper.

There is a vast body of literature and many frameworks for the validation of ML models by
exploring certain parameter spaces, regardless of whether the parameters are semantically
meaningful or not. The VERIFAI framework allows the user to define an abstract feature
space as input, which it varies to run falsification tests for the ML model [7]. DeepXplore
varies inputs for deep learning systems to explore the resulting neuron coverage, and can
find the inputs that contribute most to differential behavior [13].


3 The Scenario Variation Methodology 


As data quality plays a vital role in quality assurance for ML-based systems, the data 
generation process should incorporate maximum transparency and formalism, as with quality 
control for conventional software. Furthermore, the process should allow the identification 
and control of data quality metrics such as data accuracy, understandability, correctness and 
context coverage [8]. The scenario variation methodology, summarized in Fig. 1, affords the 
designer control over these factors via a systematic workflow and semantically meaningful 
control parameters. The following sections go through each of the steps. 


Fig. 1 The scenario variation methodology for synthetic data generation (blocks: Expert
Knowledge, Historical Data, Scenario Design with Scenario Configuration, Scenario Variation
and Scenario Evaluation, Logical Scenario, Concrete Scenarios, Replay Data, and ML Training
and Validation with ML Training, Trained Model and ML Evaluation)


3.1 Scenario Configuration


The scenario configuration stage involves the definition of the basic application scenario. 
This paper follows the classification of Dahmen et al., which divides scenarios into abstract, 
logical and concrete scenarios [6]. 


Abstract Scenario The abstract scenario provides the description of an environment and 
defines the participating entities, actions and goals. Certain parameters at this level are 
abstract, i.e. either undefined or assigned preliminary values. The abstract scenario must 
be specified in a human-readable and formal syntax (e.g. a standardized XML-Schema like 
OpenSCENARIO [2]), and must be semantically complete and consistent. For example, the abstract 
scenario of a vehicle performing a lane-change maneuver may be (informally) described 
with abstract parameters $p_1$ to $p_4$: 


Given a road with $p_1$ lanes, the actor car with the initial position on lane $p_2$ and velocity $p_3$ 
moves to lane $p_1 - 1$ after $p_4$ minutes have passed. 


Logical Scenario The logical scenario uses the abstract parameters to specify rules for 
scenario variation, and likewise follows a formal syntax and is semantically meaningful. 
Maqbool et al. [12] introduced an XML-based test specification to define logical scenarios 
via a dedicated meta-model that allows hierarchical modeling of parameter ranges, 
probability distributions, and inter-parameter mathematical and logical constraints. The example 
in Scenario 1 illustrates this approach. A generic speed distribution element is defined 
for vehicle speeds in urban settings. The two abstract parameters, speed_vehicle_1 and 
speed_vehicle_2, inherit the attributes of this element, and speed_vehicle_2 overwrites the 
distribution. A mathematical constraint is additionally specified between the abstract parameters: 
regardless of the chosen values of the abstract parameters, the constraint must hold. 


Scenario 1 Example of a logical scenario 


Define: vehicle_speed_urban 
1: range <- {20, 50} km/h 
2: distribution <- Gaussian{$\mu = 30$, $\sigma^2 = 4$} 

Parameter Variation: 
3: speed_vehicle_1 <- vehicle_speed_urban 
4: speed_vehicle_2 <- vehicle_speed_urban 
5:    overwrite distribution <- Gaussian{$\mu = 25$, $\sigma^2 = 4$} 

Constraints: 
6: speed_vehicle_1 > speed_vehicle_2 + 10 
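

To make the semantics of Scenario 1 concrete, the following minimal sketch draws parameter pairs that respect the range, the two Gaussian distributions and the constraint. It uses plain rejection sampling, which is an assumption made here for brevity and not the sampling scheme of [12]; the function names are hypothetical, and the standard deviation of 2 follows from the variance of 4.

```python
import random

# Values taken from Scenario 1; helper names are illustrative only.
RANGE = (20.0, 50.0)   # km/h
SIGMA = 2.0            # standard deviation, since the variance is 4


def sample_truncated_gauss(rng, mu, sigma, low, high):
    """Draw from a Gaussian and retry until the value lies in [low, high]."""
    while True:
        value = rng.gauss(mu, sigma)
        if low <= value <= high:
            return value


def sample_concrete_scenarios(n, seed=0):
    """Return n (speed_vehicle_1, speed_vehicle_2) pairs satisfying the constraint."""
    rng = random.Random(seed)
    scenarios = []
    while len(scenarios) < n:
        v1 = sample_truncated_gauss(rng, 30.0, SIGMA, *RANGE)
        v2 = sample_truncated_gauss(rng, 25.0, SIGMA, *RANGE)  # overwritten distribution
        if v1 > v2 + 10.0:  # inter-parameter constraint (line 6)
            scenarios.append((v1, v2))
    return scenarios


scenarios = sample_concrete_scenarios(20)
```

Because the two means are only 5 km/h apart, most raw draws violate the constraint and are discarded, which illustrates why more efficient samplers are attractive for tightly constrained parameter spaces.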


Logical Scenario Design The logical scenario discussed above is well-equipped to generate 
possible, impossible, probable and improbable scenarios. As Fig. 1 illustrates, both 
domain expertise and historical data may be taken as sources for the design of a logical scenario. 
Logical scenario design by domain expertise is exemplified in [17], where 
sets of possible values of parameters are derived by listing and clustering the pre-conceived 
situations the system may encounter. An example of design by historical data can be seen in 
[16], where a driving study from BMW is used to estimate probable driver inputs for a car 
within a sharp curve. The logical scenario methodology can fully support both approaches 
via specification of parameter distributions and constraints while simultaneously using a 
platform-independent and formal syntax to do so. The third method for logical scenario 
design illustrated in Fig. 1 via feedback from scenario evaluation is discussed in Sect. 3.4. 


3.2 Scenario Variation 


The scenario variation stage uses the logical scenarios to generate concrete scenarios. Concrete 
scenarios have concrete values for the previously abstract parameters, distributed 
according to the logical scenario specification. This stage uses sampling techniques to gener- 
ate samples distributed as close as possible to the specified parameter space. The contribution 
uses the variants of Markov-Chain-Monte-Carlo proposed by Maqbool et al. [12] to generate 
the samples. 
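
As an illustration of the general idea only, the sketch below applies a textbook random-walk Metropolis sampler to the target defined by Scenario 1. It is not the MCMC variant of [12]; the starting point, step size and burn-in length are assumptions chosen for this example.

```python
import math
import random


def target_density(v1, v2):
    """Unnormalized density of Scenario 1: two Gaussians, range limits, constraint."""
    in_range = 20.0 <= v1 <= 50.0 and 20.0 <= v2 <= 50.0
    if not in_range or not v1 > v2 + 10.0:
        return 0.0
    return math.exp(-((v1 - 30.0) ** 2) / (2 * 4.0) - ((v2 - 25.0) ** 2) / (2 * 4.0))


def metropolis(n_samples, step=1.0, burn_in=200, seed=0):
    """Random-walk Metropolis over (speed_vehicle_1, speed_vehicle_2)."""
    rng = random.Random(seed)
    state = (35.0, 22.0)  # hand-picked feasible starting point (assumption)
    density = target_density(*state)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = (state[0] + rng.gauss(0.0, step), state[1] + rng.gauss(0.0, step))
        proposal_density = target_density(*proposal)
        # Symmetric proposal, so the acceptance ratio is just the density ratio.
        if rng.random() < proposal_density / density:
            state, density = proposal, proposal_density
        if i >= burn_in:
            samples.append(state)
    return samples


samples = metropolis(100)
```

Infeasible proposals have zero density and are never accepted, so every retained sample satisfies the range and the constraint. Consecutive samples are correlated, which a practical implementation would have to account for, e.g. by thinning.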

The scenario-based approach with decoupled abstract, logical and concrete scenarios helps 
to impart understandability to ML data, as concrete scenarios provide a unique, formal and 
human-readable basis behind each data-set. Secondly, the distributions and constraints offer 
control over data accuracy: they control the similarity between concrete scenarios and 
the desired realistic distribution. Accuracy of the data is further ensured by the digital twin 
approach for simulating the concrete scenarios, discussed in the next section. 


3.3 Scenario Evaluation 


The scenario evaluation stage brings the concrete scenarios to life using simulation tech- 
niques. The authors propose the use of experimentable digital twins (EDT) to match the 
flexible and multi-domain nature of the scenarios. The EDT of a system is the digital twin 
implemented as a simulation model in a virtual testbed that offers diverse simulation func- 
tionalities. EDTs collect various aspects of the system and can be easily reconfigured for 
different application contexts throughout the training and validation process. Additionally, 
EDTs offer scalability in the level of detail (e.g. simulation realism, sensor resolution) and 
computing resources [15]. Figure 2 illustrates the EDT for a rover on an extra-terrestrial 
terrain modeled in the multi-domain simulation software VEROSIM. The figure illustrates 
how the EDT-based simulation allows the fusion of environment generation, multi-body 
dynamics and various perception sensors. 


134 O. Maqbool and J.Roßmann 


Fig. 2 Diverse sensors mounted on an extra-terrestrial rover. The sensor positions are 
labeled, and the black box illustrates the rendered output of the camera and stereo-camera 

The flexible and modular nature of EDTs makes them an ideal fit for the parameterized 
and iterative scenario evaluation methodology in Fig. 1. For instance, scenario design iterations 
can be performed on simplistic models without expensive sensor rendering, and the models can 
be seamlessly upgraded as required in subsequent iterations. The EDT-based scenario evaluation 
stage generates the ground truth data and the replay data. The ground truth data is 
annotated and labeled by the simulation and serves as ML input data, whereas the replay 
data contains the simulation events and results that may be used for the post-analysis of a 
particular simulation run. Thus EDTs further impart control over the accuracy and coherency 
of the synthetic data-sets by flexibility in the realism, scope and configuration of simulation 
entities. 


3.4 Scenario Redesign 


As previously mentioned, design-by-domain-expertise and design-by-historical-data are not 
always feasible in practical scenario design. Simulation results provide valuable insight 
into the effect of parameters and open the way towards iterative scenario design. Parameterization 
via logical scenarios makes every scenario viable for iterative redesign, and this 
iterative process can be carried out by a domain expert or an optimization algorithm. Consider 
the example (illustrated in detail in Sect. 4) of an automotive simulation, where the 
ML-designer requires data-sets from both accidental and non-accidental situations, but the 
desired scenario distribution for such data-sets is unknown. Random simulation within the 
complete parameter space can offer insight and allow the scenario designer to set the desired 
parameter bounds. Various optimization- and heuristic-based algorithms can be used for this 
purpose [1, 4]. 


3.5 ML Training and Validation 


Once the concrete scenarios in the scenario design phase have acquired the required characteristics, 
the EDT-based simulations can be used to generate ground truth or input data 
for training and validating ML-based systems. The scenario variation methodology suggests 
another feedback loop after training or validating the ML-system. This loop can be utilized, 
e.g. to iteratively find critical scenarios for the ML-model using the same heuristics as in 
Sect. 3.4. 


4 Application Examples 


Two examples from the space and automotive domains are presented to illustrate the multi-domain 
capability of the scenario variation methodology. Within both examples, the OpenSCENARIO 
standard is adapted to describe the abstract scenario, whereas the logical scenario 
is specified via the test specification in [12]. VEROSIM is used as the EDT-based 
simulation software. 


Rendezvous and Docking Scenario The rendezvous and docking (RvD) maneuver, 
illustrated in Fig. 3a, requires a chaser shuttle to scan a target satellite via LiDAR and 
determine the relative pose. The ML-based pose-estimation algorithm is to be trained via 
synthetic LiDAR scans with ground-truth information. The specifications of the chaser's LiDAR 
scanner and the ML-model impose two requirements. Firstly, all measurements 
must be taken such that the chaser is within a flight corridor. The flight corridor is specified 
via a cone, with its apex on the satellite, length $l_c$ and radius $r_c$, see Fig. 3b. Secondly, the 
closer the chaser is to the satellite, the higher the likelihood of it being near the center of the corridor. The 
simulated datasets should reflect this distribution. 


(a) The rendezvous and docking scenario. 
(b) The logical scenario problem; the square is the satellite and the dot is the chaser. 
(c) Concrete scenarios; each red dot illustrates a concrete chaser position. 


Fig. 3 Logical scenario design of a satellite docking maneuver 


Scenario 2 Logical scenario for rendezvous and docking scenario 


Parameter Variation: 
1: $x_{chaser}$, $y_{chaser}$, $z_{chaser}$ 
2:    distribution <- RvD_Dist{$r_c$, $l_c$} 

Constraints: 
3: $0 < \mathbf{n} \cdot \mathbf{d} < l_{corridor}$ 
4: $\arccos\left(\frac{\mathbf{n} \cdot \mathbf{d}}{\|\mathbf{n}\| \, \|\mathbf{d}\|}\right) < \theta$; $0 < \theta < \pi$ 


The abstract scenario is modeled via the teleport action of OpenSCENARIO: both chaser and 
satellite are teleported to initial positions, and the initial coordinates of the chaser are 
defined as abstract parameters. The logical scenario is specified as in Scenario 2. Lines 
3-4 enforce requirement 1 via inter-parameter constraints. As Fig. 3 illustrates, given the unit 
vector n along the corridor axis and the vector d from the target to the chaser, the dot product 
of the two vectors, i.e. the distance between the target and chaser along the corridor axis, must 
be positive and less than the corridor length. Secondly, the angle between n and d must be less than 
the corridor angle $\theta$, so that the chaser is always within the corridor bounds. Lines 1-2 
implement requirement 2. The "RvD_Dist" is implemented in the scenario variation engine 
by extending the meta-model in [12], and is simply referred to in the logical scenario. The 
implementation uses a Gaussian distribution dependent on the distance between the chaser 
and target. The resulting concrete scenarios generated with unique chaser positions are 
illustrated in Fig. 3c. 
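
The two corridor constraints can be checked for a candidate chaser position as in the following minimal sketch. The function name and argument layout are hypothetical; vectors are plain 3-tuples, d points from the target to the chaser, and the angle is given in radians.

```python
import math


def in_flight_corridor(n, d, l_corridor, theta):
    """Check constraints 3 and 4 of Scenario 2 for the offset d (target -> chaser)."""
    dot = sum(ni * di for ni, di in zip(n, d))
    norm_n = math.sqrt(sum(ni * ni for ni in n))
    norm_d = math.sqrt(sum(di * di for di in d))
    if norm_n == 0.0 or norm_d == 0.0:
        return False  # the angle is undefined for a zero vector
    along_axis_ok = 0.0 < dot < l_corridor                     # constraint 3
    cos_angle = max(-1.0, min(1.0, dot / (norm_n * norm_d)))   # guard against rounding
    angle_ok = math.acos(cos_angle) < theta                    # constraint 4
    return along_axis_ok and angle_ok
```

For example, with the corridor axis n = (1, 0, 0), a length of 10 and a corridor angle of 15 degrees, a chaser at d = (5, 0.5, 0) is inside the corridor, whereas d = (5, 5, 0) violates the angle constraint and d = (12, 0, 0) violates the length constraint.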


Automotive Collision Avoidance In the second use-case, a collision avoidance ML-model 
must maneuver a vehicle to avoid an incoming truck via an evasive maneuver. The ML-model 
needs sufficient samples of both collision and non-collision scenarios for training, 
otherwise it runs the risk of over-fitting to a particular case. To find the target logical 
scenario, an initial logical scenario is set up with the velocity of the car $v$ and the point of 
curvature $P_{arc}$ as abstract parameters. The point of curvature is the evasion-inducing point 
within the spline trajectory of the vehicle. The first iteration of the logical scenario, illustrated 
in Scenario 3, assigns suitable ranges to the two abstract parameters with uniform probability 
distribution functions (PDF). The resulting concrete scenarios and their EDT simulations are 
illustrated in Fig. 4a and b respectively. The percentage of no-collision scenarios is 
much lower than that of collision scenarios, which may cause the ML-model to over-fit to the 
over-represented class. Based on the results, the next iteration of logical scenario design can either impose 
new parameter ranges, or an appropriate PDF to ensure sufficiency of both collision and 
non-collision scenarios. A bi-modal Gaussian PDF with two means located within the highest 
collision and no-collision densities is chosen. The Gaussian PDF allows the scenario designer 
more flexibility by providing a finer balance between the area of the sampling region and 
the frequency of outlier sampling. The logical scenario is formulated in the second iteration 
of Scenario 3. The resulting concrete scenarios and simulations are illustrated in Fig. 4c and 
d, and show an equal distribution of accident and no-accident scenarios. With the desired 
logical distribution now found, the number of concrete scenarios can be further increased 
and the simulation can be made more complex by adding realistic sensor EDTs. 


Fig. 4 Iterative scenario design 


Scenario 3 Iterative logical scenario design for automotive collision avoidance 


Iteration 1 
1: $v$ 
2:    distribution <- uniform{8, 20} 
3: $P_{arc}$ 
4:    distribution <- uniform{0, 30} 

Iteration 2 
5: $v$, $P_{arc}$ 
6:    distribution <- Gaussian{$\mu_1 = 17$, $\mu_2 = 17$, $\sigma_1 = 1.5$, $\sigma_2 = 1.5$} 


5 Conclusions 


This contribution introduced a methodology for synthetic data generation based on formal 
scenarios, semantic parameter variation and experimentable digital twins (EDT). The 
methodology provides transparency and formality to the data generation process, and 
delivers control over data quality via scenario distribution and EDT configurations. A 
human-readable concrete scenario behind each synthetic data-set imparts a higher degree of 
understanding about the data. The proposed logical scenarios allow a formal specification of the 
scenario distribution. They can support domain expertise, historical data, as well as iterative 
methods to derive the scenario distribution. EDTs concurrently provide a platform 
to simulate the scenarios throughout the scenario design process, offering high flexibility in 
simulation perspective, complexity and scale. Future work will carry out further research 
on the iterative design of logical scenarios using metrics from trained machine learning models, 
and explore techniques to derive exploratory, exploitative and adversarial logical scenarios. 


Abstract 


The architecture, engineering and construction (AEC) industry appears hesitant to 
embrace new digital innovations. One of the few recent successful examples is the intro- 
duction of the building information modeling (BIM) paradigm. However, the focus here 
lies mainly on the building itself and does not support the construction environment. This 
paper presents a methodology for the development and application of Digital Twins rep- 
resenting and supporting the working environment of a construction site. By combining 
available geodata with real-time sensor data from mobile construction machines, it is 
possible to create always-up-to-date Digital Twins of the relevant objects and processes 
in the field in order to facilitate supervision, additional planning steps, management, con- 
trol and security activities. The proposed concept is currently being tested on a local test 
site to generate, update and adapt the Digital Twins as well as to incorporate additional 
semantic information about e.g. the soil and various working processes. 


1 Introduction 


The architecture, engineering and construction (AEC) industry is one of the least digitized 
industries [12] and only slowly embraces paradigms such as Building Information Modeling 
(BIM). BIM emphasizes the processes and technologies to create a digital model that 
represents the physical and functional characteristics of the project. In its present form, 
BIM targets mainly the building, but as Choi et al. [6] point out, the work space and the 
building environment are important resources for managing a construction site as well. 
The introduction of environmental information requires the integration of BIM and Geographic 
Information Systems (GIS), which however brings new challenges, as GIS and BIM follow 
different paradigms [1]. 

Moreover, such systems are limited by their static representation of a building and its 
environment [13], as they lack any automatic information flow from the construction site to 
the GIS+BIM model. This synchronization between real assets and their digital model is a 
key feature of a Digital Twin (DT). The concept of DTs describes a virtual representation 
of a physical object, and the corresponding flow of data between these two parts [10]. It 
is a major part of the Industry 4.0 roadmap, already embraced by other industries such as 
manufacturing and production, whereas its development in AEC is still in its infancy, as 
Deng et al. [7] state. The majority of current research explores the integration of BIM 
models and sensors, but focuses on building-related topics during the operation phase, 
such as indoor hazard monitoring, thus again leaving out the building environment during 
the construction phase. 

This paper presents a methodology for establishing and operating a DT of the environment 
of construction sites. Inspired by GIS applications, it uses geodata to initialize the DT and 
employs mobile construction machines to create information flow from the environment to 
the DT. This methodology is then tested on a local test site. 


2 State of the Art 


The majority of current research does not directly consider the environment of construction 
sites, but touches on it in efforts to automate site monitoring. In their literature review, Boje et 
al. [2] gather several examples, such as the use of drones or laser-scanning to capture and save 
changes in construction status to a BIM system. Although there is progress in automating the 
integration of captured data into the chosen building model, the act of operating the drone 
or preparing and capturing laser scans is manual. Moreover, all examples focused mainly 
on the constructed object and provided only visual evolution. 

Xu et al. [15] employed already existing construction site models to solve multi-objective 
dynamic construction site layout problems, but considered only facility or process related 
features, such as safety and environmental hazards posed by a facility. Song et al. [14] 
developed an automated tool to calculate optimized equipment travel paths using the site 
layout within BIMs; however, they assumed a 2D flat surface and square obstacles without 
consideration of elevation or ground properties. 


Living Earth—A Methodology for Modeling the Environment ... 143 

One exception is the work by Cheung and Lin [5], which assessed the level of hazardous 
gases around the construction site using a Wireless Sensor Network. Although this idea 
represents dynamic updates of the environmental state, only a single attribute is monitored 
without integration into a more extensive model of the environment. Similarly, Arroyo et 
al. [1] examined the use of geological shallow subsurface data for construction and design 
applications, which is only a limited subset of properties of the construction environment. 

This analysis indicates that the area of DTs for environmental modeling of construction 
sites has not yet been extensively explored, as the current focus lies mainly on the con- 
structed building, without taking the rest of the site into consideration. Additionally, there 
are hardly any automatic approaches for updating the state of the environmental model, 
which is required by the DT paradigm. 


3 Methodology 


This section provides a closer look at the steps of the process described in Fig. 3 and 
their implementation in the preliminary study currently being carried out on a local test 
site. 


Available Geodata Geodata (or geographic data) is data that can be referenced by a location 
relative to the Earth. The two most common types of geodata are rasterized and vectorized 
data. The former stores values like height, e.g. in digital elevation models (DEM) (Fig. 2), 
or color, e.g. in digital orthophotos (DOP) (Fig. 1), on a regular grid. By georeferencing 
the ‘origin’ and defining the resolution of the grid, every cell can be easily accessed by its 
geographic position. The latter type explicitly defines the geometry of objects by describing 
them with geometrical primitives. Every vertex is then assigned geographic coordinates. 

This work suggests using the following geodata, depending on quality and availability: 
Digital Surface Models (DSM), Digital Elevation Models (DEM), Digital Orthophotos 
(DOP), topographical maps, geological maps, city maps and building models. Such data 
is usually accessible via public databases, such as ‘OpenGeodata.NRW’, the official geodatabase 
of the government of North Rhine-Westphalia, which was mainly used in this work. 
The ‘Open Geospatial Consortium’, an authority on geospatial information, offers free and 
open standards for interaction with such databases via its ‘OGC Web Services’. Geodata can 
come in different formats, so for consistency and simplified interaction with the future Digital 
Twin, this work suggests transforming and combining rasterized data into a multilayered 
GeoTIFF file, an official OGC standard that offers functionalities for georeferenced raster 
data. A similar approach is applied to vectorized data to transform it into the CityJSON 
format. 
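
The cell access by geographic position mentioned above can be sketched as follows. The sketch assumes a north-up grid with square cells, a projected coordinate system in metres, and an origin at the upper-left corner of the grid; the function names and these conventions are assumptions for illustration, not part of the described implementation.

```python
import math


def world_to_cell(x, y, origin_x, origin_y, resolution):
    """Map a projected coordinate (metres) to (row, col) of a north-up raster.

    origin_x, origin_y: coordinate of the upper-left corner of the grid.
    resolution: edge length of one square cell in metres.
    """
    col = math.floor((x - origin_x) / resolution)
    row = math.floor((origin_y - y) / resolution)  # rows grow southwards
    return row, col


def cell_to_world(row, col, origin_x, origin_y, resolution):
    """Return the coordinate of the centre of cell (row, col)."""
    x = origin_x + (col + 0.5) * resolution
    y = origin_y - (row + 0.5) * resolution
    return x, y
```

With an origin of (1000.0, 2000.0) and a resolution of 0.5 m, the position (1002.3, 1998.9) falls into row 2, column 4, and the centre of that cell lies at (1002.25, 1998.75).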


Fig. 1 Digital Orthophoto and building model of the test site area. Building outlines are 
colored in pink and can be seen in the top right corner 


Fig. 2 Digital Elevation Model of the test site area 


The described data offers a starting point for the DT. However, public geodata sometimes 
lacks quality: in particular, its temporal resolution can be in the range of several years, so the 
data is quickly outdated. Figures 1 and 2 illustrate this, as they were taken before 
creation of the construction site and therefore show only an empty field. This issue calls for 
more frequent updates, ideally in real time, to create a true DT of the environment. 


Live Data Acquisition Mobile construction vehicles are a promising platform for gathering 
live data about the construction site environment. They frequently traverse and directly 
interact with it, e.g. by excavating earth or transporting material. Moreover, modern machines 
are already equipped with different sensors. Most of these sensors are used for condition 
monitoring of the vehicle [9] and thus measure internal data of the machine, whereas this 
work focuses on the environment. 

This situation calls for two solutions: inference of external, environmental data from avail- 
able internal sensors and equipping the construction vehicles with additional sensors. This 
work implemented both strategies. A wheel loader, a typical mobile construction machine 
for moving material using a front-mounted bucket, was equipped with the following sensors: 
Inertial Measurement Unit (IMU), Global Positioning System (GPS) tracker, RGB camera, 
Light Detection and Ranging (LiDAR) sensors, pressure sensors of the hydraulic cylinders, 
stroke transducers of the hydraulic cylinders, and wheel encoders measuring the rotation of all 
four wheels. All measurements are indexed by their unique UNIX timestamps. Depending 
on the accuracy of the GPS sensor, it can be necessary to correct the measured positions. 
This can be achieved by fusing GPS and IMU readings as described for example in [3]. 
After this optional preprocessing step, a set of measurements, each with a unique time and 
clear geographical position, is ready for further analysis. 


Data Analysis An effective data analysis strategy is key to extracting information from the 
previously gathered data. As construction progresses, the site experiences changes in material 
placement, object placement, soil conditions and the general surface model. 
These changes are of special interest for an up-to-date DT. 


Surface Model The use of IMU and LiDAR sensors enables Simultaneous Localization and 
Mapping (SLAM) based approaches. Such algorithms try to localize the robot (or machine) 
within its surroundings, while building a map of those surroundings at the same time. This 
3D map of the environment allows for updates of the surface model. The generated 
point clouds can then be rasterized by placing them on a regular grid and aggregating the 
height values of all points within a cell. This work applied the average height 
within every cell. Additionally, differences in resolution between the point cloud and the grid 
can lead to some cells being empty, as points get sparser with distance to the LiDAR or the 
rays are obstructed by the roughness of the terrain. A straightforward solution is to 
use interpolation to estimate missing cell values. 
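
The rasterization step can be sketched as follows. The sketch averages the heights per cell as described above; the gap filling uses simple linear interpolation along each grid row, which is an assumption made for brevity and not necessarily the interpolation scheme used in this work. The grid origin is assumed to be the upper-left corner, and all names are hypothetical.

```python
def rasterize_mean_height(points, origin_x, origin_y, resolution, n_rows, n_cols):
    """Average the z values of all points that fall into each grid cell.

    points: iterable of (x, y, z); the origin is the upper-left grid corner.
    Cells without any point are returned as None.
    """
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - origin_x) // resolution)
        row = int((origin_y - y) // resolution)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            sums[row][col] += z
            counts[row][col] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else None for c in range(n_cols)]
        for r in range(n_rows)
    ]


def fill_row_gaps(row):
    """Linearly interpolate None entries between the nearest known values in a row."""
    known = [i for i, v in enumerate(row) if v is not None]
    filled = list(row)
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = row[left] + t * (row[right] - row[left])
    return filled
```

Gaps at the border of a row, which have a known neighbour on one side only, are left empty in this sketch rather than extrapolated.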


Terrain Condition Besides elevation, the type of terrain is important information about the 
environment of a construction site. One possible approach is that of Kurup et al. [11], who 
propose a support-vector-machine based algorithm to classify different terrains with features 
from camera images and IMU readings. In addition, a difference between the vehicle speed and the 
rotational speed of its wheels indicates some form of slip. This information can again be combined 
with the corresponding GPS readings to detect areas with difficult-to-traverse ground. Soil compaction 
is another terrain property that can be inferred from the movement and position of the vehicle 
around the construction site, as the wheels exert pressure on the ground. All extracted 
information about the properties of the terrain is then added to the corresponding layer of 
the grid describing the environment (Fig. 3). 
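
One way to turn the speed difference into a number is the longitudinal slip ratio commonly used in vehicle dynamics, sketched below. This definition, the wheel radius parameter and the function name are assumptions for illustration; the slip measure used in the experiments of this work is defined in Sect. 4.2.

```python
def slip_ratio(wheel_angular_speed, wheel_radius, vehicle_speed, eps=1e-6):
    """Longitudinal slip: 0 for pure rolling, towards 1 for a spinning wheel,
    towards -1 for a locked (sliding) wheel.

    wheel_angular_speed in rad/s, wheel_radius in m, vehicle_speed in m/s.
    """
    wheel_speed = wheel_angular_speed * wheel_radius
    reference = max(abs(wheel_speed), abs(vehicle_speed))
    if reference < eps:
        return 0.0  # vehicle at rest, no meaningful slip
    return (wheel_speed - vehicle_speed) / reference
```

Pairing each slip value with the GPS position recorded at the same timestamp yields the point samples from which a slip layer of the grid can be built.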


Fig. 4 Example of a GeoJSON object 

[6.063219748518841, 
50.784869006931210], 
[6.063502473932873, 
50.784797692370596] 


Material Transport The main changes in the environment are caused by excavation 
and transport of material performed by the machinery. Stroke transducers and pressure 
sensors yield the actuator position of e.g. the bucket of a wheel loader. Since the kinematics 
of the vehicle and its current geographical position are known, the location of removed or 
placed material can be determined. The volume and mass of the transported material can then be 
estimated from its physical properties and the technical specifications of the machine used, e.g. 
its bucket volume. Detected material, e.g. a heap of topsoil, can then be described as a geographic 
object in vector format (see Sect. 3), containing its coordinates and attributes. 
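
The volume and mass estimate can be illustrated with a short calculation. The fill factor and the per-cycle counting are assumptions introduced for this sketch, and any numerical values passed in are hypothetical rather than measured.

```python
def transported_mass(bucket_volume_m3, bulk_density_kg_m3, fill_factor=1.0, cycles=1):
    """Estimate volume (m^3) and mass (kg) of material moved in a number of bucket cycles."""
    volume = bucket_volume_m3 * fill_factor * cycles
    return volume, volume * bulk_density_kg_m3
```

For instance, four cycles with a 2 m^3 bucket that is half full on average, moving material with a bulk density of 1500 kg/m^3, correspond to 4 m^3 and 6000 kg.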


Object Localization Finally, all objects that are not terrain or earth material also have to be 
considered, such as fences or building material. One strategy is to use RGB 
cameras and LiDARs to detect and localize objects relative to the vehicle [4, 8]. This has, 
however, only been tried in the context of autonomous driving, and more datasets with objects 
from construction sites are needed. Localized objects should be saved in vector format. At 
the end of this data analysis step, environment measurements from construction vehicles have been 
transformed into information in vector and raster form and are ready to be incorporated into the 
DT. 


Structure of Digital Twin As discussed in Sect. 3, geodata is usually handled in two 
different formats, vector and raster data. The proposed structure embraces this distinction 
and splits the DT into two parts: a multilayered GeoTIFF file and a database for vector objects. 

The GeoTIFF format is suitable for storing the raster part of the current environmental 
state due to its native integration of geospatial location, reference coordinate system and 
possibility to include multiple layers in one file. Since every layer shares the same grid 
structure, this format enforces consistency between different layers with regard to resolution 
and geographic location. The GeoTIFF file could contain the following selection of layers: 
DSM, DEM, red, green and blue channels of color data, terrain type map, soil compactness 
map, geological map. The second part of the environmental model stores all distinct geographical 
objects in a JSON document database. We propose using the GeoJSON format. 
Figure 4 shows an example of such an object. It consists of a set of user-defined properties 
and primitive geometries referenced by geographic coordinates. In this case, it is a ‘fence’-type 
object in the form of a line with additional information about its height. Together with the 
previously mentioned CityJSON objects (see Sect. 3), the DT needs only one database to 
manage all vector data, allowing for efficient and consistent access, searching and editing. 
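
A fence object of this kind could be represented as a GeoJSON Feature as sketched below. The two coordinate pairs are those visible in Fig. 4, interpreted as [longitude, latitude] as GeoJSON prescribes; the property names and the height value are illustrative assumptions, since the full figure content is not reproduced here.

```python
import json

# Coordinates are the two [longitude, latitude] pairs visible in Fig. 4;
# the property names and the height value are illustrative assumptions.
fence = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [
            [6.063219748518841, 50.784869006931210],
            [6.063502473932873, 50.784797692370596],
        ],
    },
    "properties": {"object_type": "fence", "height_m": 2.0},
}

# Serialize for storage in a JSON document database and read it back.
document = json.dumps(fence, indent=2)
restored = json.loads(document)
```

Because the object is plain JSON, it can be stored in the same document database as the CityJSON objects and queried by its properties.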

With this structure, the information about the current state of the environmental model can 
then be used to support future updates of the DT by combining it with new measurements. 
This information can also be exported back into public geodatabases, improving and updating 
their data. 


Interfaces and Applications A complete and consistent DT acts as the single source of 
truth for the state of the construction site. Through a selection of suitable interfaces and 
representations, such as 3D models, maps or dashboards and statistics, the site operator is 
supported in their monitoring, controlling and planning tasks. This constitutes an indirect 
connection from the Digital Twin back to the Real Twin, as its state directly influences the 
actions of the user, who in turn changes the state of the real construction site through their 
managerial decisions. Possible applications can be optimization of route planning based on 
elevation and ground properties or editing the site layout after changes in the construction 
plan. 


4 Experiments 


The presented methodology was implemented in a preliminary study, using a wheel loader 
and a local test site. During initialization, all available data sources depicted the site as an 
empty and flat area. The machine, equipped with the sensors described in Sect. 3, then performed 
several test drives around the area and carried out typical construction tasks to 
emulate one update pass of the digital twin. 


4.1 Surface Model 


Figure 5 demonstrates the information gain from LiDAR recordings. The point cloud records 
the area in front of the machine. For better visibility, points are colored based on their height. 
At some point in time, a depression in the ground could be clearly detected ahead of the 
vehicle, indicated by the blue region, as well as some tall vegetation in the form of the 
ragged yellow-orange structures on the right. Neither of these was present in the initial 
surface model. The recorded point cloud was then rasterized, with empty pixels linearly 
interpolated (Fig. 6), and used to update the corresponding area of the global DSM, as the current 
geographical position of the vehicle is known. 


4.2 Slip and Soil Compaction 


Figure 7 shows a qualitative map of slip in the area traversed during operation of the machine. 
The measure of slip was defined as the difference in rotation frequency between wheels on 
mostly straight movement segments. Darker regions show areas with more slip. Figure 8 
shows a qualitative map of soil compaction, which was caused by the pressure of the 
wheels of the vehicle on the ground. Additionally, crossing the same spot multiple times 
and the added weight of transported material were also considered in the estimation. Darker 
regions show a higher level of compaction. The dark patch on the right side of the figure shows 
an area where the wheel loader piled up and subsequently removed a heap of earth. This 
new environmental information is not normally available and offers new possibilities, e.g. for 
future route planning. It is added to the corresponding layers of the DT. 


Fig. 5 Recorded point cloud during wheel loader operation. Points are colored based on their height. 
The white symbol indicates the position of the viewer. A sink can be clearly observed ahead of the 
vehicle, indicated by the blue region. On the right-hand side, the rough and yellow-orange patches 
indicate vegetation 


Fig. 6 Elevation map generated from the point cloud in Fig. 5 with 10 cm resolution. Empty 
grid cells were linearly interpolated 


Fig. 7 Qualitative map of slip 


Fig. 8 Qualitative map of created soil compaction 
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5 Discussion 


The experiments demonstrated the potential of employing a DT in the context of construction 
site environments. Each test drive can be seen as a single update pass from the real environment 
to the DT. In a real use case, these updates should happen continuously and in near real time 
during the whole construction process. The implementation of an automatic system for 
updating the DT needs to be developed and tested in further studies. 

Constant updates make it possible to monitor the evolution of the construction site, 
as the DT can store its past states, which can be a powerful controlling tool. However, this raises 
ethical questions about the surveillance of construction site personnel, such as machine operators, 
since their behavior is indirectly recorded through their machine and thus also stored inside 
the DT. Another issue can arise when the full potential of the DT is exploited and its latest 
geodata are used to update the databases of public bodies: confidential business data must be 
distinguished from data cleared for public use before any export. 

The methodology describes only the essential structure of an environmental DT. Based 
on the use case, the proposed contents of the DT could be extended or further transformed 
to fit a certain application. Additional information, like the target layout of the site or safety 
thresholds on slip values, will enable detection of deviations in the current layout or provide 
insights for path planning, respectively. 


6 Conclusion 


We proposed a methodology for modeling the environment of construction sites via a Digital 
Twin (DT). After the DT is initialized with available geodata, it needs to be coupled with 
information about events happening right now at the construction site. Using construction 
vehicles equipped with different sensors, raw data about the environment can then be gath- 
ered. Next, this data is fused and analyzed to derive the desired information, such as an 
elevation model or soil condition. This finally updates the previously initialized DT. Every 
new set of measurements can then be combined with current information from the DT to 
enhance future updates. The cycle is finally closed with a range of applications, such as 
monitoring and controlling. The DT can ‘act’ on the real environment through the manage- 
ment and planning decisions of the user, supported by suitable human-machine interfaces. 
One update loop of this methodology was then implemented on a test site using a wheel 
loader. Various kinds of information, such as changes in the surface model and regions with 
slip and soil compaction, were derived from sensor measurements during vehicle operation and 
added to the existing environmental model of the test site. 

Future research will include the implementation and evaluation of a fully connected 
infrastructure to automate the measurement, analysis and update steps. In combination with 
that, other sensors, such as radar or stereo cameras, should be tested for deployment on 
construction machinery. In the long term, the extension of the environmental DTs to incorporate 
DTs of constructed objects and the construction machines themselves will offer new topics 
for investigation. 


Analyzing Natural Resting Aspects 
of Arbitrary Components Using a Physics 
Engine 


Abstract 


Part Feeding Systems play a vital role in automated assembly, linking in-house logis- 
tics with individual assembly stations. One of the main tasks of part feeding systems 
is to transfer components from a disordered state (e.g. bulk material) to an ordered 
state (defined position and orientation) so that they can be further processed by auto- 
mated handling equipment. Knowledge of the natural resting aspects (probability that 
a geometrical body rests in a certain orientation) of the components is essential for 
the development and design of part feeding systems. The experimental determina- 
tion of natural resting aspects is time consuming and expensive since extensive drop 
tests have to be carried out. Therefore, many approaches have been taken to derive 
the natural resting aspects mathematically based on the component geometry or by 
direct dynamic simulation. In this work, the physics engine integrated in the open-source 
software Blender is used to determine natural resting aspects of arbitrary components without the need for 
experimental drop tests. In virtual drop tests, components are imported in the com- 
mon STL format and are dropped on a surface from random initial orientations. The 
resting orientations of the components are exported and automatically evaluated using 
MATLAB. The functionality and accuracy of the approach are evaluated by conducting 
experimental drop tests with five exemplary components. The evaluation shows good 
agreement between simulated and experimental results. 


1 Introduction 


Modern production environments are characterized by an increasing level of automation. 
Especially in assembly, there is still considerable potential to enhance efficiency and 
productivity through automation. Key components of automated assembly 
systems are part feeding systems, which transfer the assembly components from a disordered 
state (e.g. bulk material) to an ordered state with a defined orientation and position. 
Vibratory bowl feeders are a widely used example [1]. They consist of 
a vibrating bowl with a spiral track. Due to the vibration, the components are transported 
up the track, which is equipped with different chicanes or traps that sort out (reject) 
components in undesired orientations. In order to achieve high feeding rates, the number of 
rejected components should be minimized. Therefore, knowledge of the probabilities for 
a component to naturally adopt certain orientations (natural resting aspects, cf. Fig. 1) is 
essential for an efficient design of vibratory bowl feeders [2]. The design of the chicanes 
must allow components in highly probable orientations to pass and reject components in 
unlikely orientations. Apart from vibratory bowl feeders, knowledge of the natural rest- 
ing aspects of a component is also valuable for the design of various types of conveying 
and part feeding systems like linear feeders, camera-based pick-and-place systems, or 
aerodynamic feeding systems [3]. 

The experimental determination of the natural resting aspects in manual drop tests is 
very time-consuming because a component has to be dropped several hundred times and 
the resulting resting orientations have to be documented manually. This work presents 
a novel method for an automated determination of the natural resting aspects of arbi- 
trary components with the use of a physics engine. In the first step, the component is 
imported into a physics engine in the common STL format (Sect. 3.1). Then, the virtual 
component is repeatedly dropped on a surface from a defined height with a random initial 
orientation (Sect. 3.2). The component bounces off the surface, changing orientation 
multiple times. When the component comes to a rest, its orientation is exported for 
further processing (Sect. 3.3). The extraction of the natural resting aspects from the 
exported data is explained in Sect. 4. Ultimately, the experimental evaluation is presented 
(Sect. 5) and the results are discussed (Sect. 6). 


Fig. 1 Natural resting aspects of an L-shaped exemplary component (resting aspects 1 to 5) 


2 Related Work 


In one of the first works regarding model-based determination of natural resting aspects, 
Boothroyd and Ho used the energy barrier between different orientations (resting 
aspects) of a component as an indicator for the stability of their resting aspects [2]. They 
assumed that the probability that a component comes to rest in a particular orientation is 
proportional to the energy needed to change the orientation. Boothroyd and Ho applied the 
method to simple, regular prismatic and cylindrical components and validated it experi- 
mentally with drop tests. The results showed high consistency [2]. However, even though 
the method is sometimes referenced for comparison, it was not developed any further (cf. 
[4]) and is strongly limited with regard to the component complexity. 

To overcome the limitations mentioned above, Ngoi et al. presented the centroid 
solid angle (CSA) method [5]. They proposed that the probability with which a component 
comes to rest on any of the feasible aspects is proportional to the solid angle 
subtended by the considered aspect as seen from the centroid of the component, and 
inversely proportional to the height of the centroid above the considered aspect. 
In [5], the CSA method is compared to Boothroyd 
and Ho’s energy barrier method and successfully validated with experimental drop tests 
using a T-shaped prism as exemplary component. In following works, Ngoi et al. refined 
and evaluated the CSA method using different exemplary components, with a displaced 
center of gravity or form elements like bores or grooves, for example [6]. In [7], the 
method and the drop test results were also validated using a vibratory bowl feeder. 
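
Read literally, the two proportionalities of the CSA hypothesis can be combined into one 
normalized expression. The following is only a sketch of that reading. The symbols 
$\Omega_i$ (centroid solid angle of aspect $i$), $h_i$ (height of the centroid above 
aspect $i$) and $m$ (number of feasible aspects) are introduced here for illustration 
and are not taken from [5]: 

$$p_i = \frac{\Omega_i / h_i}{\sum_{j=1}^{m} \Omega_j / h_j}, \qquad i = 1, \ldots, m$$

The denominator only normalizes the values so that the probabilities of all feasible 
aspects sum to one. 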

In [8], Ngoi et al. introduced and evaluated the critical solid angle (CRSA) method. 
For the CRSA method, they assumed that the probability that a component comes to rest 
on a certain aspect is proportional to the difference between the centroid solid angle of 
that aspect and the average of the critical solid angles of the surrounding aspects. The 
critical solid angle between two aspects of a component is determined by the criti- 
cal position of the centroid when tilting the component from one aspect to another. A 
detailed explanation of the CRSA method is given in [8]. 

Chua and Tay developed the stability method, which analyzes the stability of a com- 
ponent when resting on a certain aspect. The stability is defined as a function of the con- 
tact area of the aspect with the surface the component is resting on and the distance of 
the center of gravity to said surface [9]. The stability is proportional to the contact area 
and inversely proportional to the distance of the center of gravity. 

The described methods (energy barrier, CSA, CRSA and stability) were evaluated by 
Suresh et al. [10] and Udhayakumar et al. [11] using brake pads and sector-shaped parts 
as exemplary components. The results show good accuracy for all methods. 

In contrast to the analytical methods described above, Moll and Erdmann introduced 
a numerical approach to simulate drop tests with the aim to determine the optimal drop 
height and surface shape for a component to rest on a certain aspect [12]. They used a 
dynamic simulator to simulate the behavior of polyhedral rigid bodies when dropped on 
a surface from a defined height. However, the dynamic simulator was limited to two- 
dimensional components. 

Várkonyi introduced numerical dynamic simulation to determine the natural resting 
aspects of randomly generated three-dimensional polyhedra [13]. The aim of this work 
was to create a dataset to compare the accuracy of three existing analytical methods and 
three estimators (developed by Várkonyi). Experimental results showed good agreement 
when the components were dropped on a hard surface, but significant deviations when 
the components were dropped on a soft surface. Boothroyd and Ho define a hard surface 
(e.g. metal, glass) as a surface with a negligible horizontal impact force (no friction) 
as opposed to a soft surface (e.g. rubber), where significant horizontal forces occur on 
impact [2]. For the dynamic simulation, Várkonyi only considered vertical impact forces, 
resulting in frictionless impacts, which limits the model to the simulation of drop tests on 
hard surfaces. 

The review of the related work shows that there are multiple approaches towards a 
model-based analysis of natural resting aspects. However, the presented approaches are 
limited with regard to the component spectrum, the usability, or the adaptability 
of the simulated environment (e.g. surface properties, surface geometry). To counteract 
these limitations, a novel, more adaptable approach is presented in the following. 


3 Drop Test Simulation Using a Physics Engine 


In this paper, the physics engine integrated in Blender [14] is used for the simulation 
of the natural component behavior. The software is used to simulate a standard drop 
test where a component is dropped onto a soft surface from a constant height. Using 
the open-source software Blender promises multiple advantages: The components 
can be imported in the common STL format, which results in an accurate representation 
of the simulated component and a high flexibility with regard to the component 
spectrum. Furthermore, Blender offers Python-based script control, which enables full 
automation of the iterative drop test simulation. Lastly, the environment can be adapted 
freely: the shape, inclination and other parameters of the surface can be changed, 
and walls or other restricting objects can be placed in the virtual setup. In 
this work, the framework for the drop test simulation and the identification of the natural 
resting aspects is presented and evaluated experimentally. 

The drop test is performed for n iterations with random initial orientations and yields 
a distribution of the natural resting aspects of arbitrary components. Blender is particularly 
well suited for integrating the simulation model into a statistical test design due to 
its dynamic script control in Python. Figure 2 shows the general program flow for 
performing drop tests of a component. The program is divided into the simulation 
environment preparation (Sect. 3.1), the simulation run (Sect. 3.2), and the export of the rotation 
data (Sect. 3.3). 


3.1 Preparation of the Simulation Environment 


Firstly, the simulation environment is automatically set up in Blender using a Python 
script. To do this, the plane on which the components fall, as well as the component 
itself, are imported as STL files. The local component coordinate system $(CS)_w$ is then 
relocated to the component's center of gravity and aligned with the inertial coordinate 
system $(CS)_0$. After that, the component is moved to the constant drop height $h$ with the 
vector ${}_0 r_w = (0, 0, 60)^T$ mm. Lastly, the component is randomly oriented in 
$\mathbb{R}^3$ with uniform distribution (cf. Figs. 3 and 4a, b). 
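
The text does not state how the uniformly distributed initial orientation is generated. One 
common option, shown here only as an assumption, is to draw a uniformly distributed unit 
quaternion with Shoemake's method. The function name is hypothetical. 

```python
import math
import random


def random_unit_quaternion(rng=random):
    """Draw a unit quaternion uniformly distributed over all 3D orientations.

    Uses Shoemake's method with three independent uniform numbers in [0, 1).
    rng can be the random module or a random.Random instance (for seeding).
    """
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    r1, r2 = math.sqrt(1.0 - u1), math.sqrt(u1)
    return (
        r1 * math.sin(2.0 * math.pi * u2),
        r1 * math.cos(2.0 * math.pi * u2),
        r2 * math.sin(2.0 * math.pi * u3),
        r2 * math.cos(2.0 * math.pi * u3),
    )


if __name__ == "__main__":
    q = random_unit_quaternion(random.Random(0))
    print(q, sum(c * c for c in q))  # the squared norm is 1 up to rounding
```

Sampling three Euler angles uniformly would not give a uniform distribution of orientations, 
which is why a quaternion-based draw is used in this sketch. In a Blender script, such a 
tuple could be assigned to an object's `rotation_quaternion` after setting its 
`rotation_mode` to `'QUATERNION'`. 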

In the following, the simulation boundary conditions are defined. The simulation is a 
rigid body simulation: all components are modeled as ideal solid bodies and component 
deformation is neglected. The falling component is characterized as an active rigid body 
and is assigned a mass according to its density. It can move freely in $\mathbb{R}^3$ with 6 DOF. 
The plane, on the other hand, is defined as a passive rigid body and thus fixed in space. 
Subsequently, the gravitational field is defined with a constant acceleration of 
$g = 9.81\,\mathrm{m/s^2}$ in the negative Z-direction of $(CS)_0$. Finally, the interaction 
of the two rigid bodies is determined. The surface response (bounciness and friction) is 
particularly important here. The bounciness $b$ ($0 \le b \le 1$) describes the tendency of a 
rigid body to bounce after colliding with another, where 0 represents a completely inelastic 
collision and 1 a completely elastic one. The friction $\mu$ ($0 \le \mu \le 1$) describes the 
resistance between two touching rigid bodies with a relative velocity. For a collision 
behavior that matches the interaction with a soft surface, the bounciness is set to 
$b = 0.8$ and the friction is set to $\mu = 0.5$. 


Fig. 2 Flow chart of drop test simulation 


Fig. 3 Initial position and orientation of a component in the simulation environment 


Fig. 4 Coordinate transformation during the simulation 




3.2 Simulation Run and Cancellation Criterion 


For a representative distribution of the natural resting aspects of each component, the 
drop test is performed for a total number of n iterations. Each iteration i represents a 
dropping process of the component and results in a stable pose (natural resting aspect). 
To ensure that the component remains in a stable pose at the end of each iteration, a 
cancellation criterion is defined. Each iteration is divided into simulated frames f. The 
density of frames can be set by the number of frames per second. For each frame f, the 
simulated position of the component's center of gravity and its orientation are saved. If 
neither the position nor the orientation changes for several frames in a row, the part is in 
a stable final pose, the simulation of the current iteration i is finished, and a new iteration 
is started. This process continues until all n iterations are simulated. 
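
The cancellation criterion can be sketched in a few lines of Python. The text only states that 
the pose must remain unchanged "for several frames in a row". The number of frames 
(hold_frames) and the two tolerances below are therefore hypothetical parameters, as are the 
function names. 

```python
import math


def is_still(pose_a, pose_b, pos_tol, rot_tol):
    """True if position and orientation barely change between two frames.

    A pose is (position, quaternion). The quaternions q and -q describe the
    same orientation, hence the absolute value of the dot product.
    """
    (pos_a, quat_a), (pos_b, quat_b) = pose_a, pose_b
    moved = math.dist(pos_a, pos_b)
    dot = abs(sum(x * y for x, y in zip(quat_a, quat_b)))
    return moved <= pos_tol and (1.0 - dot) <= rot_tol


def first_rest_frame(poses, hold_frames=5, pos_tol=1e-4, rot_tol=1e-6):
    """Index of the frame at which the pose has been unchanged for
    hold_frames consecutive frame-to-frame comparisons, or None."""
    still = 0
    for f in range(1, len(poses)):
        if is_still(poses[f - 1], poses[f], pos_tol, rot_tol):
            still += 1
        else:
            still = 0
        if still >= hold_frames:
            return f
    return None


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    falling = [((0.0, 0.0, 60.0 - 10.0 * k), identity) for k in range(4)]
    resting = [((0.0, 0.0, 5.0), identity)] * 6
    print(first_rest_frame(falling + resting))
```

Once a rest frame is found, the iteration can be stopped and the pose of that frame exported. 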


3.3 Export of Rotation Data 


The last simulated frame of each iteration of the physics simulation returns a quaternion 
${}_0\psi_i$, which gives information about the orientation of the component, and a location 
vector ${}_0 r_i$, which determines its position in the inertial coordinate system $(CS)_0$. 
A quaternion is a hypercomplex number and is constructed as follows: 


${}_0\psi_i = a + b\,\mathrm{i} + c\,\mathrm{j} + d\,\mathrm{k}$ with $\mathrm{i}^2 = \mathrm{j}^2 = \mathrm{k}^2 = -1$ and $a, b, c, d \in \mathbb{R}$ (1) 


The change of orientation represented by a quaternion is described by a rotation around 
${}_0 x_i$ in $\mathbb{R}^3$ with the angle $\varphi$. The coefficient $a$ represents the rotation 
angle with $a = \cos(\varphi)$, and $b$, $c$, $d$ represent the coefficients of the rotation 
axis ${}_0 x_i$: 


${}_0 x_i = \frac{1}{\sin(\cos^{-1}(a))} \, (b, c, d)^T$ (2) 


Each quaternion ${}_0\psi_i$ indicates the rotation of $(CS)_0$ into $(CS)_{w,i}$ (cf. Fig. 4). 


4 Data Evaluation and Identification of Natural Resting 
Positions 


The data exported from Blender, which provide information about the orientation and 
position of the individual final poses, are imported into a MATLAB framework and pro- 
cessed in the next step. To get a probability distribution of the final poses (natural resting 
aspects), the raw data are evaluated and sorted. The component pose can be precisely 
determined or assigned by specifying its orientation. However, there are several component 
poses representing the same natural resting aspect. To determine which poses 
represent the same resting aspect, a classification feature based on the rotation data 
(quaternion) was worked out. It states that two poses can be assigned to the same resting 
aspect if the coordinate systems can be transformed into each other by a pure rotation 
around the Z-axis of the coordinate system $(CS)_0$. Figure 4c shows two different component 
orientations (${}_0\psi_1$, ${}_0\psi_2$). In both cases, the X-axis and Y-axis are aligned with the 
ground surface, which means the component rests on the same aspect. Both orientations 
therefore represent the same natural resting aspect. 

To classify the final component poses into natural resting aspects, all orientations 
${}_0\psi_i$ (with $i = 1 \ldots n$) are systematically compared with each other. In the following, two 
iterations of the drop test and thus two different end orientations are used to explain the 
classification algorithm. The first orientation is represented by the quaternion ${}_0\psi_i$, the 
second one by the quaternion ${}_0\psi_{i+1}$. To compare the two orientations, ${}_0\psi_i$ is multiplied 
with the complex conjugate of ${}_0\psi_{i+1}$. The resulting quaternion ${}_i z$ then describes the 
rotation of the first orientation $(CS)_i$ into the second orientation $(CS)_{i+1}$: 


${}_i z = {}_0\psi_i \cdot {}_0\psi_{i+1}^{*}$ (3) 


According to Eq. (2), the axis of this rotation ${}_i x_{i,i+1}$ is extracted and then transformed 
into the inertial coordinate system $(CS)_0$: 


${}_0 x_{i,i+1} = {}_0 R_i \; {}_i x_{i,i+1}$ (4) 


As already mentioned, two orientations represent the same resting aspect if the rotation axis 
${}_0 x_{i,i+1}$ is parallel to the Z-axis of $(CS)_0$: 


${}_0 z = (0, 0, 1)^T$ (5) 


Each quaternion ${}_0\psi_i$ is thus assigned to one of $m$ stable component poses. The result 
is a probability distribution of the natural resting aspects. The results are automatically 
plotted in a pie chart with a corresponding figure of each stable resting aspect (Fig. 6). 
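
The classification idea can be sketched as follows. The sketch is written in Python rather than 
in the MATLAB framework used in this work, and it is not the authors' implementation. It assumes 
unit quaternions (a, b, c, d) that rotate the component frame into the inertial frame, with the 
usual convention a = cos(theta/2) for a geometric rotation angle theta. Under this assumption, 
the axis of the relative rotation is already expressed in the inertial frame, so the 
transformation of Eq. (4) is not needed here. All function names, the tolerance and the greedy 
grouping are illustrative choices. 

```python
import math


def q_mul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def q_conj(q):
    """Conjugate quaternion (inverse rotation for unit quaternions)."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def rotation_axis(q, eps=1e-6):
    """Axis as in Eq. (2): (b, c, d) / sin(arccos(a)).

    Returns None if the rotation is (numerically) the identity.
    """
    a, b, c, d = q
    s = math.sin(math.acos(max(-1.0, min(1.0, a))))
    if abs(s) < eps:
        return None
    return (b / s, c / s, d / s)


def same_resting_aspect(q_i, q_j, tol=1e-3):
    """True if the two poses differ only by a rotation about the inertial Z-axis."""
    axis = rotation_axis(q_mul(q_i, q_conj(q_j)))  # relative rotation, cf. Eq. (3)
    if axis is None:
        return True  # identical orientations
    return math.hypot(axis[0], axis[1]) < tol  # axis parallel to (0, 0, 1)^T


def classify(quaternions, tol=1e-3):
    """Greedy grouping: each pose joins the first aspect whose reference pose matches.

    Returns the relative frequency of each resting aspect found.
    """
    references, counts = [], []
    for q in quaternions:
        for k, ref in enumerate(references):
            if same_resting_aspect(ref, q, tol):
                counts[k] += 1
                break
        else:
            references.append(q)
            counts.append(1)
    return [count / len(quaternions) for count in counts]


def axis_angle_quaternion(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about a unit axis (helper for the demo)."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


if __name__ == "__main__":
    qx = axis_angle_quaternion((1.0, 0.0, 0.0), 90.0)   # component tipped onto a side
    qz = axis_angle_quaternion((0.0, 0.0, 1.0), 40.0)   # spin about the vertical axis
    poses = [qx, q_mul(qz, qx), (1.0, 0.0, 0.0, 0.0), qx]
    print(classify(poses))
```

In the demo, the first, second and fourth pose differ only by a spin about the vertical axis and 
therefore fall into the same resting aspect, while the identity orientation forms a second one. 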


Fig. 5 Exemplary components (3D-printed, no infill) 


5 Experimental Evaluation 


For the experimental evaluation, the results of the simulated drop tests are compared 
to the results of manual drop tests using five different 3D-printed exemplary compo- 
nents (Fig. 5). They vary in shape and are intended to cover a wide spectrum of possible 
components in reality. For both the simulated and the manual drop tests, each component 
is dropped on a soft surface 1000 times (n = 1000) from random initial orientations. 
In the experimental setup, the soft surface consists of a 2 mm thick rubber mat 
adhered to a wooden board. 


Fig. 6 Comparison of simulated and experimental drop test results for five components 


6 Results 


Figure 6 shows the results of the simulated and experimental drop tests. The results show 
that with two exceptions, all possible natural resting aspects of the components were 
identified. The exceptions occur for component 5, whose resting aspects 4 and 5 have 
a very small probability of occurrence: the simulation and the experimental 
drop tests each failed to identify one of these resting aspects. 
The number of drop tests (1000) may therefore not be sufficient to reliably identify 
all natural resting aspects and could be increased in future work. 
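
The effect of the sample size on rare resting aspects can be estimated with a short calculation. 
If the drops are assumed to be independent and a resting aspect occurs with probability p per 
drop, it is missed in all n drops with probability (1 - p)^n. The value p = 0.001 used below is 
purely hypothetical and is not a measured probability from this work. The function names are 
also illustrative. 

```python
import math


def miss_probability(p, n):
    """Probability that an aspect with per-drop probability p never occurs in n drops."""
    return (1.0 - p) ** n


def drops_needed(p, confidence):
    """Smallest n for which the aspect occurs at least once with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    print(miss_probability(0.001, 1000))  # about 0.37
    print(drops_needed(0.001, 0.95))      # 2995
```

Under these assumptions, an aspect with a probability of 0.1% would be missed in roughly one out 
of three series of 1000 drops, and about 3000 drops would be needed to observe it at least once 
with 95% confidence. 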

The average deviation between the probability of occurrence of a resting aspect pre- 
dicted by the simulation model and the experimental results is 6.4%. However, the accu- 
racy varies depending on the considered component. Components 2, 3 and 5 have an 
average deviation of 3.2, 1.7 and 6.7% respectively, while the results of components 1 
and 4 deviate by 10.0 and 10.4% on average. The highest deviation occurs in component 
4 with a deviation of 31.2% for orientation 6 (Figs. 6 and 7). 

In their extensive experimental evaluation, Udhayakumar et al. determined average 
deviations of 11.1, 8.5 and 8.7% between the experimental results and the results 
returned by the CSA, stability and CRSA methods, respectively [11]. They investigated 
eight different sector-shaped components. The maximum deviation in [11] was 23.3%. 

The deviations between simulation and reality can have different reasons. As already 
mentioned, it can be assumed that a larger sample, i.e. iterations per component in the 
simulation as well as in the real drop test, can lead to a higher agreement of the data. 
However, the larger deviations (component 1: aspects 1 and 3 (cf. Fig. 1); component 4: 
aspects 3 and 6) cannot be explained by an insufficient sample size. It is assumed that 
differences in the bouncing and damping behavior of the components and the surface 
between the simulation and the real test setup are a major factor for these deviations. 
Furthermore, the influence of the drop height on the resting aspects was not taken into 
account and could also lead to significant deviations. 


Fig. 7 Natural resting aspects of component 4 


7 Conclusion and Outlook 


In this work, a novel method for the automated determination of natural resting aspects 
of arbitrary components by means of a physics engine was presented. The physics engine 
iteratively simulates a drop test and exports the resulting resting poses of the component. 
A MATLAB framework then compares all exported component orientations, automati- 
cally clusters them, and returns the probability distribution of the natural resting aspects. 
An experimental evaluation of the new method shows promising results with an average 
deviation of 6.4% between simulated and experimental results. Nevertheless, for some 
components, the deviation between simulated and experimental probability of particular 
resting aspects is higher. In order to increase the simulation accuracy, to extend the vali- 
dated component spectrum, and to include external influences, future work will focus on 
two aspects: 


1. Extensive studies based on DoE methods will be carried out with multiple, more com- 
plex components. Using the acquired data, the simulation model will be parameter- 
ized with regard to the surface bounciness and friction (cf. Sect. 3.1) to increase the 
determination accuracy. Furthermore, the studies will validate the flexibility of the 
novel method with regard to the component geometry. 

2. The simulation model will be modified to include the influence of a complex environ- 
ment (e.g. inclination, walls) on the resting aspects of a component. The modifica- 
tions will be validated on the concrete application of a newly developed, vision-based 
aerodynamic part feeding system [3]. 


Abstract 


Robotic grasping of small metallic objects such as bolts is a challenging task due to 
the small dimensions and textureless reflective surfaces. Depth images acquired of 
such objects are often noisy and error-prone. In addition, parts overlap because 
they are provided randomly oriented in a box such as a small load carrier. To overcome 
the limitations of existing solutions for bolt separation, a flexible and cost-effective 
system is developed using an industrial robot and a magnetic gripper. In a 
two-stage procedure, the bolts are first grasped blindly from a box and placed on a flat 
surface. In the second step, object detection and pose estimation are performed and the 
individual bolts are grasped and inserted into a fixture, so that finally the bolts are in 
a defined position. Industrial use cases for this system are the automated preparation 
of bolts for robotic screwing processes or automated commissioning of small objects 
for assembly tasks. The methodology, implementation and evaluation of the proposed 
solution are presented in this paper. 


1 Introduction 


The automation of assembly tasks provides benefits such as increasing productivity and 
quality, relieving employees of monotonous tasks and reducing costs. Due to their flex- 
ibility, industrial robots are applied for automation in assembly in different use cases 
like the automation of handling and screwing processes. The flexibility of the automa- 
tion system is particularly crucial with low quantities, a high number of variants or short 
product life cycles. 

To enable robotic assembly automation, the individual components usually have to 
be placed in defined poses, which is also required for bolts used for robotic screwing. 
Conventional systems such as step feeders or vibratory bowl feeders do not provide the 
necessary flexibility to handle different types of bolts and require additional space and 
investment. If, in contrast, the industrial robot designated for the automation task is used 
for handling of parts, system utilization can be increased and further investment costs 
can be saved. 

One use case considered is the robotic screwing of components, where the industrial 
robot can be used to grasp and separate the bolts. If the assembly process takes place in 
a partially automated production line with 1- or 2-shift operation, the preparation can be 
done by the automated system in the overnight shift. Another use case is the automated 
commissioning of parts to provide them in a defined number and position e.g. in shadow 
boards in order to reduce search times in manual assembly. 

The aim of this research is to develop a cost-effective solution for the use cases men- 
tioned above. The use of an industrial robot with a suitable end effector enables grasping 
of different objects and therefore offers a high degree of variant flexibility. The metallic 
bolts used in these applications are characterized by small dimensions and a textureless, 
reflective surface. These characteristics hamper the realization of a cost-effective and 
flexible solution and pose challenges for existing bin-picking solutions. 

Therefore, a novel two-stage method for robotic bin picking of small magnetic objects 
is presented in this paper. The proposed system is characterized by its flexibility and the 
use of edge computing devices for object detection, pose estimation and motion planning. 
As a result, the system can be easily integrated into existing applications without 
major modifications to the overall robotic cell. 

In the following, the corresponding state of the art is described in detail and the need 
for action is identified. The methodology, implementation and evaluation are presented 
subsequently. 


2 State of the Art 


In addition to the selection of a suitable gripper, a key requirement in the implementation 
of bin-picking solutions is the precise and robust estimation of a suitable grasping pose. 
There are already numerous solutions for determining the object pose or suitable gripping 
positions, which are summarized, for example, in [1]. However, many of those solutions 
require high computation power and are often not suitable for small, textureless and 
symmetric objects as present in the specified use case. Therefore, the detailed state of the 
art regarding object recognition on edge devices as well as pose estimation and grasping 
of small, metallic objects is presented in the following. 

For object recognition in colour images, the state of the art offers a variety of solutions 
that can also be executed on low-power hardware. Widely used solutions include 
YOLO and its version optimized for mobile devices, Tiny-YOLO [2], as well as Pelee [3]. 

The segmentation of individual objects can also be performed robustly on low-power 
hardware. Solutions such as FuseNet, SegNet, or YolactEdge can run on edge devices like 
the NVIDIA Jetson TX2 or NVIDIA Jetson AGX Xavier, enabling semantic segmentation 
and, in the case of YolactEdge, instance segmentation at 30 FPS and above [4, 5]. 

Due to the dimensions and metallic surface of the bolts as well as their unordered 
positions, e.g. in a load carrier, the automated bin picking of bolts is highly challenging. 
The textureless, metallic surface of the bolts causes reflections and does not provide 
many distinct features, which leads to significant noise in the data captured with common 
RGB-D cameras and to inadequate point clouds of the objects. Thus, some approaches try 
to grasp and separate bolts or similar objects without the use of computer vision. 

Mathiesen et al. present a solution whereby a robot equipped with a scoop-shaped 
tool grabs the required parts from a box. Within the tool, an orienting groove is used to 
ensure that only objects with the desired orientation are kept in the scoop. Afterwards the 
oriented objects can be grasped from the scoop using a separate tool or gripper [6]. 

Ishige et al. also avoid the application of computer vision and use a gripper with two 
individually movable fingers and integrated tactile sensors for object grasping and sepa- 
ration instead. First, multiple objects are grasped from a box at once and the number of 
bolts between the fingers is counted using the tactile sensors. Then the gripper fingers are 
moved so that excess bolts fall out and finally only one bolt remains in the gripper [7]. 

In a complementary approach, von Dirgalski et al. propose the combined use of computer 
vision and force sensors to determine the pose of an object between the gripper fingers [8]. 

This contrasts with methods using colour and depth data to determine the pose of 
individual bolts. Furukawa et al. use RGB-D data combined with a template matching 
approach to detect M6 bolts and subsequently grasp them using a two-finger gripper [9]. 
The solution presented by Nakano is based on machine learning instead and uses a single 
shot 6DoF pose estimator to determine the pose of a bolt before grasping it [10]. 

To circumvent the effects of erroneous depth information for reflective objects, Sato 
et al. propose a two-step process. In this process, multiple objects are grasped from a 
load carrier using a magnetic gripper and are placed on a flat surface. Subsequently 
the objects are classified using RGB information and are individually grasped with the 
magnetic gripper. Thereby the objects remain in an unknown pose and can thus only be 
sorted but not fitted into a fixture or mounted to other components [11]. 

Another way of coping with noisy depth data is the 6DoF pose estimation pipeline 
for textureless, metallic objects presented by Blank et al. However, this solution is only 
suitable to a limited extent for small, bulk components, since the objects are too close to 
each other and overlap and thus cannot be clearly segmented [12]. 

While all of the presented solutions enable the separation of small, metallic objects, 
they still come with drawbacks. The approaches either require the design of an object- 
specific tool, provide the separated objects with an unknown pose, or are only suitable 
for objects above a certain size. At the same time, established solutions for separation 
and orientation such as vibratory bowl feeders or step feeders do not offer the necessary 
flexibility and are characterized by high space requirements and investment costs. Thus, 
a novel approach for the separation of small, metallic bolts is presented in the following. 


3 Two-Stage Bin Picking: System Design and Methodology 


The proposed method provides the bolts in a defined pose after the two-stage grasping 
process. At the same time, the approach copes with noisy depth information and is 
characterized by its low investment costs and flexibility. 


3.1 Requirements and System Design 


The overall aim is the development of a flexible and cost-effective system that can be 
easily adapted to different objects or variants. The bolts to be separated have small 
dimensions with a total length of about 15 mm to 35 mm and diameters of 3 mm to 
5 mm at the cylindrical shaft and 10 mm to 15 mm at the head of the bolt. The metallic 
bolts are magnetic and have textureless, reflective surfaces. 

In addition, space requirements and the integration of suitable sensors have to be 
considered. Small and lightweight sensors for object recognition should be arranged at 
the robotic end effector, while fixed infrastructural sensor systems should be avoided. A 
cost-efficient solution is preferred. 

The proposed setup consists of an industrial robot with an appropriate end effector 
containing a magnetic gripper and a vision sensor. Thus, there is the restriction that only 
magnetic objects, such as metallic bolts, can be picked. A box, e.g. a small load carrier, 
containing the bolts in random poses is placed in the robot’s workspace. In addition, a 
fixture is used to store the objects in a defined position after grasping. 

Grasping the small metallic objects directly from the box is complex and challenging. 
Due to overlapping of the randomly oriented parts, difficulties arise in finding suitable 
grasping poses and high accuracy is required when grasping the objects. Furthermore, 
reflections occur, causing noisy or faulty depth measurements and difficulties in identifying 
the bolts. Therefore, a two-stage procedure is proposed. 


3.2 Two-Stage Procedure for Bin Picking with Magnetic Gripper 


Due to the challenges described above, a two-stage procedure is presented for the bin 
picking task. It consists of a blind, i.e. visionless, grasp into the box in the first stage and 
the grasping of individual bolts from the work surface in the second stage. The procedure 
is shown in Fig. 1. 

An image of the workspace is taken in a defined scan pose using the vision system 
attached to the industrial robot. The camera is oriented parallel to the work surface 
at a predefined height. If no bolt is detected, a blind grasp is performed, whereby the robot 
moves the magnetic gripper into the box without using visual information. Some bolts 
are grasped and afterwards placed on the work surface next to the box. 

If a bolt is detected in the image, its pose is estimated, the bolt is grasped and placed 
in the fixture. The process is repeated until the required number of bolts is reached or 
the fixture is completely equipped. 
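
The decision logic of the two-stage procedure can be sketched as a simple loop. The following Python snippet is only an illustration under stated assumptions: the callables `detect_bolt`, `blind_grasp` and `pick_and_place` are hypothetical placeholders for the vision and robot functions, and every single pick is counted as a successful placement, which the real system does not guarantee.

```python
from typing import Callable, Optional, Tuple

# Hypothetical pose type: (x, y, angle) of a bolt lying on the work surface.
Pose = Tuple[float, float, float]


def run_two_stage_picking(
    required_bolts: int,
    detect_bolt: Callable[[], Optional[Pose]],
    blind_grasp: Callable[[], None],
    pick_and_place: Callable[[Pose], None],
    max_cycles: int = 1000,
) -> int:
    """Repeat scan -> (blind grasp | single pick) until enough bolts are placed."""
    placed = 0
    for _ in range(max_cycles):
        if placed >= required_bolts:
            break
        pose = detect_bolt()       # image from the scan pose plus detection
        if pose is None:
            blind_grasp()          # stage 1: move several bolts onto the surface
        else:
            pick_and_place(pose)   # stage 2: grasp one bolt, put it in the fixture
            placed += 1            # assumption: every pick succeeds
    return placed
```

The `max_cycles` guard is an addition of this sketch so that the loop terminates even if the box runs empty.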


Fig. 1 Two-stage procedure for bin picking with magnetic gripper 


3.3 Grasping Process and Pose Estimation 


A custom-made magnetic gripper consisting of an electromagnet with a microcontroller 
is used to grasp the bolts. Process knowledge is used to control the force of the magnetic 
gripper depending on the type of bolts (especially the weight) and the intended picking 
process (first step or second step). When grasping blindly into the box in the first step, 
the force is strong enough to pick up several bolts and place them on the work surface. 
Subsequently, in order to grasp one bolt, the tip of the magnetic gripper is placed at the 
head of the bolt and an appropriate force is set to grasp exactly one bolt. 

For this purpose, the pose of the bolt is determined using the implemented computer vision 
system. The bolts lying on the work surface have three degrees of freedom (DOF), two 
translational and one rotational DOF. Applying a previously trained convolutional neural 
network (CNN), the positions of the bolt as well as the bolt’s head are determined in the 
image. The center points of the bounding boxes of these two object classes are provided. 
In addition to the position of the bolt in x- and y-direction, the orientation of the bolt is 
determined from the two center points and the corresponding angle, as depicted in 
Fig. 2. 
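
A minimal sketch of this pose computation is given below. It assumes that bounding boxes are given as `(x_min, y_min, x_max, y_max)` in image coordinates and that the orientation is measured from the bolt center towards the head center; both conventions, as well as the function names, are assumptions of this example and are not taken from the implementation.

```python
import math


def box_center(box):
    """Center point of a bounding box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def bolt_pose_2d(bolt_box, head_box):
    """Return bolt center, head center and orientation angle in radians."""
    bolt_center = box_center(bolt_box)
    head_center = box_center(head_box)
    # Direction from the center of the complete bolt towards the center of its head.
    angle = math.atan2(head_center[1] - bolt_center[1],
                       head_center[0] - bolt_center[0])
    return bolt_center, head_center, angle
```

Using `atan2` instead of a plain arctangent keeps the full angular range, so a bolt whose head points to the left is distinguished from one whose head points to the right.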


Fig. 2 Position of a bolt lying on the work surface: a) side view and b) top view 


4 Implementation and Evaluation 


To evaluate the described system, it is implemented as depicted in Fig. 3. A lightweight 
robot UR10 from Universal Robots is used to automate the process and an Intel 
RealSense L515 LIDAR is used to capture the environment (see Fig. 3a). The sensor 
uses the time-of-flight principle and provides a point cloud of the environment with a 
maximum resolution of 1024 × 768 spatial points. In addition, a colour image of the 
environment with a maximum resolution of 1920 × 1080 pixels is captured and superimposed 
with the generated point cloud. This allows the pixel coordinates of the colour 
image to be converted to the corresponding spatial points. 

Fig. 3 a Implementation of the overall system, b identification of bolts using YOLOv3 and c 
grasping position of an individual bolt 
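
The conversion of a pixel coordinate into a spatial point can be illustrated with the standard pinhole camera model. The sketch below is not the sensor's own routine: it assumes that a depth value aligned to the colour pixel is already available and that lens distortion is negligible, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are placeholders that would have to be read from the actual camera calibration.

```python
def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with depth in metres to a 3D point in the camera frame.

    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    All four are assumed calibration values, not data from the paper.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)
```

A pixel at the principal point maps to a point on the optical axis, and the lateral offset grows linearly with both the pixel offset and the measured depth.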


The CNN YOLOv3 is used to identify the bolts. To train the CNN, 350 images were 
manually annotated. The complete bolt as well as the head of the bolt are annotated 
individually in order to distinguish the respective parts during recognition. For 
training, the annotated image dataset is divided into the actual training data, the validation 
data and the test data in a ratio of 80%, 10% and 10%. The hyperparameters for the 
training were chosen as listed in Table 1. After completing the training, the model 
achieved a mean average precision of 95% on the test set. Figure 3b shows the identification 
of the bolts using YOLOv3 with bounding boxes of the complete bolt (purple) as 
well as the heads (light green). 
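
For illustration, an 80/10/10 split of 350 annotated images yields 280 training, 35 validation and 35 test images. The following sketch shows one way to produce such a split; the shuffling with a fixed seed and the function name are assumptions of this example, since the paper does not state how the images were assigned.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items reproducibly and split them into train, validation and test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(ratios[0] * len(items))
    n_val = round(ratios[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # the remainder goes to the test set
    return train, val, test


train, val, test = split_dataset(range(350))
print(len(train), len(val), len(test))
```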


Table 1 Hyperparameters used for the training of YOLOv3 

Hyperparameter                    Value 
Batch size                        64 
Subdivisions                      16 
Iterations                        6000 
Initial learning rate             0.001 
Steps and corresponding scales    4800 (scale: 0.1), 5400 (scale: 0.1) 
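
The step schedule in Table 1 can be read as follows: the learning rate starts at 0.001 and is multiplied by 0.1 at iteration 4800 and again at iteration 5400. The sketch below reproduces this arithmetic; it assumes that no warm-up (burn-in) phase is used, since none is listed in the table, and the function name is illustrative.

```python
def learning_rate(iteration, base_lr=0.001, steps=((4800, 0.1), (5400, 0.1))):
    """Learning rate of a step schedule: each scale applies once its step is reached."""
    lr = base_lr
    for step, scale in steps:
        if iteration >= step:
            lr *= scale
    return lr


for it in (0, 4799, 4800, 5400, 5999):
    print(it, learning_rate(it))
```

In the Darknet framework, the batch of 64 images is processed in 16 subdivisions, i.e. 4 images per forward pass, which reduces the memory demand during training.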


174 M. Herbert et al. 


The developed magnetic gripper consists of an electromagnet with a diameter of 
25 mm, a length of 20 mm and a maximal retention force of 50 N. The electromag- 
net is switched via a bridge circuit and the overall magnetic gripper is controlled via 
an Arduino Uno, whereby the magnetic field strength can be adjusted via a pulse-width 
modulated signal. An additional crash protection is installed between the gripper and the 
robot flange to avoid damage to the robot in the case of a faulty gripping attempt. 

The software required to achieve an automated grasping and separation process is 
implemented using the Robot Operating System (ROS). The darknet_ros package can 
be used for the integration of YOLO. The motion planning for the UR10 is done using 
the MoveIt framework. The calculation of the gripping pose, the control of the magnetic 
gripper as well as the sequence control are implemented using the middleware provided 
by ROS. The whole software including object recognition, gripping pose calculation and 
motion planning is executed on an NVIDIA Jetson AGX Xavier. 

To evaluate the presented system, multiple test runs are performed. During the test 
runs, M5 cylinder head bolts are used as gripping objects. The objective of every test run 
is to place 16 bolts in the fixture. The process starts by grasping multiple bolts from the 
box and placing them on the work surface. The bolts are then grasped individually (see 
Fig. 3c) and inserted into the holes of the fixture. Once no more bolts are detected on the 
work surface, new bolts are grasped from the box. Each test run continues until the fixture 
is fully equipped or an error occurs. 

No errors or faults occurred during the runs when grasping the bolts out of the box. 
In every iteration, between two and five bolts were grasped from the box and placed on 
the work surface. In the subsequent process of grasping and placing the individual bolts, 
70% of the bolts were grasped successfully and 60% of the bolts could be deposited suc- 
cessfully in the fixture. 

The following issues caused unsuccessful placements in the fixture and failed grasping 
attempts. An unsuccessful placement of a bolt was always connected to a faulty pose 
estimation or a faulty grasping attempt. When the bolt is not centred beneath the gripper 
or is tilted after grasping it from the work surface, it cannot be placed in the fixture correctly. 
Since currently no optical verification of the correct grasping pose is integrated, a 
wrongly oriented or tilted bolt is placed next to a hole when it is inserted into the fixture. 
This can also result in an unacceptably high compression force and an emergency stop of 
the robot. 

Unsuccessful or faulty grasping attempts are mainly caused by bolts lying close 
together. In these cases, the bolt and its head cannot be recognized unambiguously and 
thus no grasping pose or an invalid one is calculated. When no bolt was grasped from the 
work surface, the process continues, but the space in the holder remains empty after the 
placement is completed. 

Another error was residual magnetization of the bolts after insertion into the fixture, 
which prevented them from being released from the gripper. However, this error can be 
reliably prevented by moving the gripper away at an angle after insertion. 


5 Summary and Outlook 


The system presented in this paper enables the flexible and cost-effective separation of 
bolts using an industrial robot and a magnetic gripper. After separation, the bolts are in 
a known pose, allowing them to be inserted directly into a custom fixture. As the depth 
data of the small and reflective bolts is noisy and error-prone, a two-stage process is used 
to separate the bolts. First, multiple bolts are grasped from the box and placed on 
an even surface. Afterwards, object detection and pose estimation are performed to 
grasp a single bolt in a defined manner. The presented implementation and evaluation 
demonstrate the functionality and potential of the system. Further test runs have to be 
performed with different bolt variants. 

In order to address the errors encountered during the evaluation, the following 
improvements and enhancements will be made in the next development step. An addi- 
tional colour camera will be integrated to check if a bolt has been successfully grasped 
from the surface and whether it is in the required pose. If not, the bolt is put down again 
and the grasping process is repeated. Furthermore, after placing the bolt in the fixture, 
the colour camera of the LIDAR should be used to check that the bolt has been inserted 
correctly. 

Moreover, a combined force and position control will be used while inserting a bolt in 
the holes of the fixture to compensate for small deviations in the placement position. As 
shown by Metzner et al., the application of a suitable compensation strategy can signifi- 
cantly increase the success rate when inserting objects into holes [13]. 

Finally, after integrating the improvements mentioned above, an evaluation of the 
overall process with different bolt types will be carried out and the success rate when 
loading different fixtures will be evaluated. 


Abstract 


Improving the dynamic path accuracy has been a major research topic in industrial 
robotics for decades. It is known that the drivetrains installed in the robot joints limit 
further improvements. There is considerably more literature on the dynamic behavior of harmonic 
drives (HDs) than on cycloidal drives (CDs), which are usually installed in industrial robots 
(IRs) with heavy payload. However, a more profound knowledge of the occurring effects 
offers the potential for both design- and control-based enhancements. Therefore, this 
paper presents an experimental study of the friction and hysteresis behavior with explicit 
consideration of further dependencies, such as temperature and load. Based on these 
investigations, a model as well as a control-based compensation approach that does not 
require additional gearbox output sensors is proposed. The investigation and validation 
are carried out with an experimental setup equivalent to the drivetrain of an IR with heavy 
payload. 


1 Introduction 


It has been known for several decades that the dynamic behavior of the drivetrains installed in 
the robot joints limits the obtainable path accuracy [1, 2]. These drivetrains, usually 
consisting of a precision gearbox and a permanent magnet synchronous machine (PMSM), 
exhibit a variety of nonlinear effects such as torque ripple, kinematic error, friction and 
hysteresis. In IRs with heavy payloads, CDs are installed as precision gearboxes 
instead of HDs, which are commonly used for lightweight robots. The reason for 
this is the overload capability of CDs due to the operating principle with rolling contact 
instead of tooth meshing [3]. 

Previous studies [4-14] have focused almost entirely on HDs. From these investigations 
it is known that the friction depends on additional quantities, such as temperature or 
load. However, to the authors' best knowledge, no study has yet been published on a potential 
dependence of the hysteresis on additional quantities. Therefore, this paper addresses the 
knowledge transfer from HDs to CDs by presenting an experimental investigation of additional 
dependencies of the friction and hysteresis behavior of CDs. Furthermore, a model 
of these dependencies as well as a control-based compensation approach is proposed. 


2 Related Work 


Since the 1990s, research efforts have been made to model the hysteresis behavior of HDs, 
which is caused by friction and nonlinear stiffness. Early dynamic models were proposed by 
Seyfferth et al. [4], Taghirad and Bélanger [5] as well as Dhaouadi et al. [6], among others. As 
recent work on this topic, the studies of Tjahjowidodo et al. [15] and Ruderman et al. [7, 8] 
are noteworthy. Tjahjowidodo et al. [15] use parallel Maxwell-slip elements to describe the 
nonlinear dynamics, whereas Ruderman and Iwasaki [8] adopt a rate-independent Bouc-Wen 
hysteresis model. In addition to the modeling, Ruderman and Iwasaki propose a sensorless 
hysteresis compensation approach based on a generalized momentum observer [16] and a 
Stribeck friction model. To the authors' best knowledge, only Dhaouadi et al. [6] investigated 
a possible multidimensionality of the hysteresis of HDs. In that study, hysteresis curves for 
different frequencies were determined, and no additional dependence was found. 

In contrast, there is significantly more work exploring the friction behavior. Bittencourt 
et al. [9] investigated the load and temperature dependence using the example of the second 
joint of an ABB IRB 6620. They found that the effects of temperature and load are independent 
and, based on this, suggested an empirical model. In contrast, in [10] no temperature 
dependence, but an additional position dependence was considered using lookup tables. Carlson et 
al. [11] model the temperature dependence of both an ABB IRB140 and an ABB YuMi, 
using a temperature-dependent Coulomb friction adaptation based on the estimated thermal 
energy stored in the robot joints. Simoni et al. [12] also studied the temperature dependence 
of the friction in the assembled state. Considering the second joint of a Comau SMART 
NS-16-1.65, two different modeling approaches based on a polynomial friction model were 
proposed: first, a model with a linear temperature dependence of the entire 
friction model, and second, a model where each parameter exhibits an individual but 
linear dependence. Madsen et al. [13] consider the temperature dependence of a Universal 
Robots UR5e using an additive, polynomial friction term. The authors note, however, that 
their approach does not extrapolate well and thus may lead to problems in practical 
applications. In addition, the load dependence is taken into account using an adaptation of the 
Coulomb friction coefficient with respect to the squared load torque. Another approach is 
the approximation of the temperature dependence of the friction using a neural network [14]. 
The validation is carried out on a testbench with a single joint of DLR's humanoid robot 
David with position, torque, and temperature sensors. 

An experimental investigation of the hysteresis behavior of CDs, which is closely linked 
to the friction behavior, has not yet been published, apart from our previous work [17]. In that 
work, we proposed a Bouc-Wen as well as a nonlinear auto-regressive with exogenous inputs 
(NARX) model to represent the hysteresis behavior of CDs. However, a possible temperature 
or frequency dependence of the hysteresis behavior was not investigated. In addition, the 
models were not validated using a compensation scheme. 


3 Experimental Setup 


All subsequent investigations of the friction and hysteresis behavior of CDs are carried 
out on the experimental setup shown in Fig. 1, which simulates a robot joint of the heavy 
payload class with one degree of freedom. The CD under test is the precision gearbox 
RH380-N from Nabtesco with a rated torque of 3.7 kNm and a gear ratio $u$ of 185. The 
lubricant temperature of the CD is measured using a PT100 sensor. The CD is driven by 
the PMSM MSKO070D from Bosch Rexroth, which is equipped with a 13-bit encoder. The 
joint torque $\tau_g$ is measured with the torque sensor T40B from HBM. A dynamic load 
torque can be applied via the water-cooled high-torque motor DST2-315KO from Baumüller. 
For this purpose, the load motor is connected to the output side of the CD using a Roba DS 
1400 double-jointed coupling from Mayr with a torsional stiffness of 15e6 Nm/rad. 

The experimental setup is operated with a rapid prototyping platform from Speedgoat, which 
executes a Simulink model. The rapid prototyping platform communicates with industrial 
motion controllers from Bosch Rexroth and Baumüller, on which the current control of the 
motors runs, through EtherCAT with a 500 µs cycle time. 

Fig. 1 Experimental setup used for the investigation (adapted from [17]) 


4 Investigation 


From the related work (see Sect. 2) it is known that the friction depends on the 
load and temperature in addition to the velocity. However, for the hysteresis, there is no 
study that examined additional dependencies. Therefore, in this section, the friction and a 
potential multidimensionality of the hysteresis behavior of CDs are experimentally investigated. 


4.1 Friction 


Classical, static friction models describe a functional relationship between the friction 
torque $\tau_f$ and the relative velocity of the contact surfaces. Assuming only a dependence 
on the motor velocity $\dot{\theta}$, as is often the case in industrial robotics, it is possible to identify 
the friction behavior by closed-loop motion trajectories with constant velocity. We assume 
that the effects of temperature and load are independent, which significantly reduces the 
investigation burden. The validity of this assumption was already shown in [9]. 

To investigate an additional dependence on the load torque, this experiment was repeated 
several times, while a constant load torque $\tau_{\mathrm{ext}}$ was applied using the output motor. Figure 2 
shows the results for load torques of 0 to 3 kNm as well as for negative and positive loads. 
The friction torque clearly increases with increasing load torque. This can 
be explained by the fact that an increased load leads to an increase in the contact surface, 
which in turn results in a higher friction torque. However, this relationship between the load 
torque and the friction torque is nonlinear as well as dependent on the direction of rotation. 

To investigate the temperature dependence, a constant motor velocity of 150 rad/s is 
used to heat up the joint. Once the target temperature is reached, the experiment is carried out. 
About 30 min were required to heat from 20 to 50°C. The corresponding friction curves 
are shown in Fig. 3. The rising temperature leads to an increase in the static and Coulomb 
friction, whereas the viscous friction decreases. With increasing temperature, the viscosity 
of lubricants decreases, which explains the decrease in viscous friction. 

Fig. 3 Friction torque $\tau_f$ in dependence of temperature $T$ (in steps of 5 K) 

Simultaneously, the increase in temperature leads to an expansion of the material, which 
increases the contact surface and thus may explain the increase in static and Coulomb friction. 


4.2 Hysteresis 


The hysteresis behavior of robot joints is typically modeled as a nonlinear differential 
equation of the joint torque depending on the joint torsion as well as its derivative. To the 
authors' best knowledge, other potential dependencies, such as on frequency or temperature, 
have not been examined. The investigation of these dependencies is performed by applying a sine signal 
of the load torque $\tau_{\mathrm{ext}}$ with an amplitude of 3 kNm, while varying the additional quantities. 
Subsequently, a static hysteresis curve is obtained in each case by plotting the joint torque $\tau_g$ 
against the torsion angle $\varphi$. 

First, the frequency of the sine signal of the load torque is varied between 0.125 Hz 
and 2 Hz. The resulting static hysteresis curves, which are nearly identical, are shown in 
Fig. 4a. This corresponds to a frequency independence, which is beneficial since simpler, 
rate-independent hysteresis models are sufficient. The procedure to heat up the robot 
joint corresponds to that of the investigation of the friction in Sect. 4.1. The obtained static 
hysteresis curves for the temperatures 20, 35 and 50°C are shown in Fig. 4b. 

Fig. 4 Friction torque $\tau_f$ in dependence of load torque $\tau_{\mathrm{ext}}$, frequency and temperature $T$ 

Note that a frequency of 0.5 Hz was chosen for the sine signal; however, this is irrelevant due to the frequency 
independence. An increasing stiffness with rising temperature is noticeable, although the 
basic shape of the hysteresis curve does not change significantly. This stiffness increase may 
be explained by a temperature-dependent material expansion. 


5 Compensation Method 


The control-based hysteresis compensation of robot joints is an approach to meet the 
ever-increasing accuracy requirements in industrial robotics. In this case, cost-effective 
approaches that do not require additional gearbox output sensors are advantageous. In the 
following, we first propose a model based on the investigation above. Thereafter, we present 
a compensation approach without gearbox output sensors and validate it on the experimental 
setup. 


5.1 Modeling 


The proposed model originates from the flexible joint model according to Spong [2]. However, 
this model is supplemented by a temperature-dependent hysteresis spring as well as a 
velocity-, load-, and temperature-dependent friction. This leads to the dynamics of a single 
robot joint 

$$\ddot{q} = M^{-1} \left[ \tau_g + \tau_{\mathrm{ext}} \right], \qquad \ddot{\theta} = J_m^{-1} \left[ \tau_m - \tau_f - u^{-1} \tau_g \right], \qquad (1)$$

with the motor position $\theta$ and joint position $q$, the gear ratio $u$, the joint-side inertia $M$, the motor 
inertia $J_m$, the motor torque $\tau_m$, the external load $\tau_{\mathrm{ext}}$, the friction $\tau_f$ and the joint torque $\tau_g$. 

For the temperature-dependent hysteresis spring we adopt a Bouc-Wen model based on 
our previous work [17]. This model, which is rate-independent, reads as follows: 

$$\tau_g = w k \varphi + (1 - w) k x, \qquad \dot{x} = \dot{\varphi} - \beta |\dot{\varphi}| \, |x|^{n-1} x - \gamma \dot{\varphi} \, |x|^{n}, \qquad (2)$$

with the nonlinear, temperature-dependent stiffness 

$$k(T) = k_0(T) + k_1(T) |\varphi| + k_2(T) |\varphi|^3, \qquad (3)$$

the torsion $\varphi = u^{-1} \theta - q$, the weighting factor $0 < w < 1$, the internal state $x$, the shape 
parameters $\gamma$, $\beta$, $n$ and the temperature $T$. Due to the nonlinear behavior of the model, 
the identification is performed using particle swarm optimization. In extension of our 
previous work [17], a temperature-dependent stiffness (3) is considered to account for the 
observed temperature behavior. Therefore, the identification procedure is repeated at 35 
and 50°C, with only the stiffness parameters included as free model parameters. 
Subsequently, a second-order temperature-dependent polynomial is fitted separately for 
each stiffness parameter $k_0$, $k_1$, $k_2$ by minimizing the mean squared error (MSE) using the 
temperature-independent parameter estimates. 
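
The polynomial fit can be illustrated for a single stiffness parameter. The sketch below assumes that estimates are available at exactly three temperatures (20, 35 and 50°C); in that case, the least-squares fit of a second-order polynomial reduces to exact interpolation through the three points, which is what the code computes. The numerical values are purely hypothetical and are not identification results from this work.

```python
def quadratic_through(points):
    """Return the second-order polynomial through three (T, value) points.

    With three support points, the least-squares quadratic fit is exact,
    so Lagrange interpolation gives the same polynomial.
    """
    (t0, y0), (t1, y1), (t2, y2) = points

    def poly(t):
        return (y0 * (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
                + y1 * (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
                + y2 * (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1)))

    return poly


# Hypothetical, normalized estimates of one stiffness parameter at 20, 35 and 50 degC.
k0_of_T = quadratic_through([(20.0, 1.00), (35.0, 1.20), (50.0, 1.50)])
print(k0_of_T(27.5))  # interpolated value between the identified temperatures
```

Such a polynomial should only be evaluated within the identified temperature range, since a quadratic fitted to three points carries no information about the behavior outside of it.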

To account for the load and temperature dependence of the friction, we assume, following 
[9], that the temperature and load friction effects 

$$\tau_f(\dot{\theta}, T, \tau_g) = \tau_{f,T}(\dot{\theta}, T) + \tau_{f,l}(\dot{\theta}, \tau_g) \qquad (4)$$

are independent. For the load-dependent friction $\tau_{f,l}$ we apply a 2-D lookup table as proposed 
in [10]. Regarding the temperature-dependent friction $\tau_{f,T}$, we adopt a LuGre model [18] 

$$\tau_{f,T} = \sigma_0 z + \sigma_1 \exp\left( -\left( \dot{\theta} / v_d \right)^2 \right) \dot{z} + F_v \dot{\theta}, \qquad \dot{z} = \dot{\theta} - \sigma_0 \frac{|\dot{\theta}|}{g(\dot{\theta})} z, \qquad (5)$$

with the temperature-dependent Stribeck curve 

$$g(\dot{\theta}) = F_c(T) + \left( F_s(T) - F_c(T) \right) \exp\left( -\left| \dot{\theta} / v_s \right|^{\delta} \right), \qquad (6)$$

the Coulomb $F_c$, viscous $F_v$ and static $F_s$ friction coefficients, the bristle stiffness $\sigma_0$ and 
damping $\sigma_1$, the shaping factor $\delta$ and the Stribeck velocity $v_s$. The identification is done in a 
two-step process. First, the static friction parameters ($F_c$, $F_v$, $F_s$, $\delta$, $v_s$) are obtained using the 
Levenberg-Marquardt algorithm to minimize the MSE between a classical Stribeck model 
and the measurement (cf. Fig. 3) at each temperature. Secondly, for each of the parameters 
separately, a second-order temperature-dependent polynomial is fitted in the same way as 
for the hysteresis behavior. Subsequently, the dynamic parameters $\sigma_0$, $\sigma_1$, $v_d$ are identified 
employing a particle swarm optimization. 
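
To make the role of the static parameters more tangible, the following sketch evaluates a Stribeck curve of the usual LuGre form at a fixed temperature and the resulting steady-state friction, which is the quantity a classical Stribeck model is fitted to. It relies on the standard property that the bristle state settles at a constant value for constant velocity. All parameter values in the example call are hypothetical and not identified values.

```python
import math


def stribeck(v, f_c, f_s, v_s, delta):
    """Stribeck curve g(v): equals f_s at standstill and tends to f_c at high speed."""
    return f_c + (f_s - f_c) * math.exp(-abs(v / v_s) ** delta)


def steady_state_friction(v, f_c, f_s, f_v, v_s, delta):
    """Friction torque at constant velocity v (bristle state at rest)."""
    if v == 0.0:
        return 0.0
    return math.copysign(stribeck(v, f_c, f_s, v_s, delta), v) + f_v * v


# Hypothetical parameters: Coulomb 1.0, static 2.0, viscous 0.01, v_s 0.1, delta 2.
for v in (0.01, 0.1, 1.0, 10.0):
    print(v, steady_state_friction(v, 1.0, 2.0, 0.01, 0.1, 2.0))
```

For a temperature-dependent model, the arguments `f_c`, `f_s` and `f_v` would be replaced by the values of their fitted temperature polynomials.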


5.2 Compensation Scheme 


The proposed compensation scheme, which is shown in Fig. 5, is adapted from [8, 19]. The 
compensation is based on an inversion of the hysteresis, requiring the joint torque, which 
is not measured. 


184 P. Mesmer et al. 


Fig. 5 Control scheme of robot joint hysteresis compensation approach 

Instead of a sensor, a so-called generalized momentum observer, which is 
known from collision detection of robots [16], is utilized. The starting point for the derivation 
is the generalized momentum $p = J_m \dot{\theta}$. The observer is obtained from the time derivative 

$$\dot{p} = J_m \ddot{\theta} = \tau_m - \tau_f - u^{-1} \tau_g \qquad (7)$$

of the generalized momentum and the residual 

$$r = K_O \left[ \int \dot{\hat{p}} \, \mathrm{d}t - p \right] = K_O \left[ \int \left( \tau_m - \tau_f - r \right) \mathrm{d}t - p \right]. \qquad (8)$$

This residual equals the estimate of the joint torque, which 
becomes obvious by taking its time derivative 

$$\dot{r} = K_O \left[ \tau_m - \tau_f - r - \dot{p} \right] = K_O \left[ -r + u^{-1} \tau_g \right] \qquad (9)$$

and transforming it into the Laplace domain 

$$\frac{\hat{\tau}_g}{\tau_g} = \frac{u \, r}{\tau_g} = \frac{K_O}{s + K_O} \approx 1. \qquad (10)$$

Subsequently, the hysteresis behavior (2) is inverted to obtain the estimated joint torsion 

$$\hat{\varphi} = (w k)^{-1} \left[ u \, r - (1 - w) k x \right]. \qquad (11)$$

Finally, the estimated torsion is added to the desired joint position $q_d$. 

It is known from [19] that residual oscillations may occur using this compensation 
scheme. To avoid this effect, a dead zone of the position error is included, which matches 
the noise of the estimated joint torsion $\hat{\varphi}$ at standstill. 
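
The inversion and the dead zone can be sketched in a few lines. The code below assumes a continuous dead-zone characteristic, i.e. values outside the zone are shifted towards zero by the zone width; the exact form and the place where the dead zone acts in the control loop are not specified here, so this is only one plausible variant. The function names and numerical values are illustrative.

```python
import math


def dead_zone(value, width):
    """Return 0 inside [-width, width]; outside, shift the value towards 0 by width."""
    if abs(value) <= width:
        return 0.0
    return value - math.copysign(width, value)


def estimated_torsion(r, u, w, k, x):
    """Invert tau_g = w*k*phi + (1 - w)*k*x for phi, using the estimate tau_g = u*r."""
    return (u * r - (1.0 - w) * k * x) / (w * k)


# Round trip with hypothetical values: the torsion is recovered from the residual.
w, k, phi, x, u = 0.6, 2.0e6, 1.0e-3, 5.0e-4, 185.0
tau_g = w * k * phi + (1.0 - w) * k * x
print(estimated_torsion(tau_g / u, u, w, k, x))
```

Choosing the dead-zone width equal to the noise amplitude of the estimated torsion at standstill suppresses corrections that would only react to noise.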

Moreover, the compensation scheme needs an estimate of the friction. In extension of 
[8, 19], we apply a LuGre observer instead of a static Stribeck model. The observer 

$$\hat{\tau}_{f,T} = \sigma_0 \hat{z} + \sigma_1 \exp\left( -\left( \dot{\theta} / v_d \right)^2 \right) \dot{\hat{z}} + F_v \dot{\theta}, \qquad (12a)$$

$$\dot{\hat{z}} = \dot{\theta} - \sigma_0 \frac{|\dot{\theta}|}{g(\dot{\theta})} \hat{z} + k_f \left( \tau_m - J_m \ddot{\theta}_d - r - \left( \hat{\tau}_{f,T} + \tau_{f,l} \right) \right), \qquad (12b)$$

with the observer gain $k_f$, is adapted from [20] and extended by the previously modeled 
temperature and load dependence. Due to an insufficient sensor resolution, we apply the 
desired motor acceleration $\ddot{\theta}_d$ instead of the measured motor acceleration $\ddot{\theta}$. 


5.3 Experimental Validation 


The experimental validation is performed on the test bench of Fig. 1 by applying a 
point-to-point trajectory of the desired joint position $q_d$ with a trapezoidal acceleration profile. 
Simultaneously, a sinusoidal load torque $\tau_{\mathrm{ext}}$ with an amplitude of 3 kNm and a frequency of 
1/16 Hz, imitating a gravity-induced force, is set. The experiment is conducted at gearbox 
lubricant temperatures of $T$ = 20 and 35°C. Figure 6 shows the desired joint position $q_d$, the load 
torque $\tau_{\mathrm{ext}}$ and the tracking errors $e_q = q_d - q$ of the experiment. The presented tracking errors 
correspond to the scenarios without compensation, with compensation according to [8] 
and with the proposed compensation scheme. With the compensation according to [8], the $\mathcal{L}_1$ 
norm of the tracking error at $T$ = 20°C is reduced by 81% from 13.8 mrad to 2.64 mrad. 
However, oscillations are evident at standstill. 

Fig. 6 Tracking experiment without (black), with compensation according to [8] (blue) and the 
proposed compensation (yellow) at $T$ = 20 and 35°C 

Using the proposed compensation, the $\mathcal{L}_1$ norm is reduced by another 56% to 1.17 mrad 
and the oscillations are avoided. At $T$ = 35°C, the tracking error is reduced to a larger 
extent compared with the compensation according to [8], owing to the modeled temperature dependence. 


6 Conclusions 


In this paper, an experimental investigation of the friction and hysteresis behavior of cycloidal 
drives was presented. The investigation revealed a significant load and temperature depen- 
dence of the friction. However, the hysteresis is rate-independent, and there is a low, 
temperature-dependent increase in stiffness. Therefore, the results indicate a great simi- 
larity between HDs and CDs regarding the friction and hysteresis behavior. Moreover, a 
compensation approach with an extended friction model was proposed, which improves the 
trajectory tracking performance compared to the state of the art. 

In the future the temperature sensor may be replaced by an observer. In addition the 
approach should be validated on a six degree-of-freedom manipulator in a practical appli- 
cation such as milling. 
Abstract


A concept of how load imposed by an exoskeleton on the upper arm affects shoulder torque
is given using a mechanical mock-up of the shoulder-arm-system and a serial kinematic
robot. System identification methods for linear surrogate models of the human shoulder-
arm-system and their embeddings in control loops are introduced. Early measurements
of a novel multi-sensor LiDAR system for real-time capture of human motion
are presented, and their implications are discussed. The experimental setup is used for direct
shoulder torque readings and control.
Fig. 1 Different layers of simulation. (1): Base case of a test person (a) wearing an exoskeleton (c).
(2): The wearable is replaced and simulated by a collaborative, serial kinematic robot (b). (3): The
shoulder-arm-system is mimicked by a servo drive with double pendulum attached (d)


1 Introduction 


Exoskeletons are wearable devices supporting tasks commonly found in industrial appli- 
cations [1]. They modify the wearer’s internal load distribution by means of active [2] or 
passive [5] elements, or by a combination of both [7]. When aiming for a defined reduction of
joint torques and forces, the question arises of how to determine these quantities reliably.
Joint forces and torques result from the sum of the individual muscle forces and their lever
arms acting on the segments, and they cannot be measured in vivo.

Our proposed method to determine and manipulate internal load distributions with the aid
of exoskeletons is to replace the human wearing an exoskeleton with a mechanical mock-up
of a simplified shoulder-arm-system, restricted to planar movement. Additionally, the
effect of the exoskeleton is simulated with a collaborative robot that is mechanically coupled
to the upper arm of the analog simulator and imposes forces to achieve a control goal, such as
constant shoulder torque over time while carrying out a pre-defined task, see also Fig. 1. The
advantage of this approach is that it enables direct readings of the torques and forces acting
on the shoulder via appropriate sensors and by measuring motor currents. The cobot, used as a
substitute for the actual exoskeleton, allows for distinct control inputs, thus simulating
the support of the exoskeleton.


2 Related Work 


Lower-dimensional surrogate models for the prediction and feature extraction of human
motion data are an actively researched area in which principal component analysis, neural
networks, and statistical methods are among the most popular approaches [6, 8, 9]. Gaussian
process latent variable models (GPLVM), also considered as probabilistic nonlinear PCA, have
been used by Marin [13] to create a low-dimensional surrogate model embedded in an optimization
problem to minimize ergonomic scores of drilling tasks. DMD-based methods have not yet
become established in analyzing and predicting human motion. The work of Enes [12] isolates the
reason for this and introduces a delay-embedded DMD algorithm to remedy the drawbacks of
exact DMD. Patil [4] fused LiDAR and inertial measurement unit (IMU) sensor data to track
human motion in real time. In [25], a motion-controlled mechanical mock-up of the shoulder
joint is introduced, exhibiting a rotational degree of freedom of the scapula.


3 Mechanical Mock-up 


The mechanical mock-up of the shoulder-arm system comprises a gearless servo drive, a
double pendulum attached to its shaft, and sensors for force and angular readings.
Upper arm and forearm are made of milled aluminum parts and reflect the mass and dimensions
of their human counterparts. At the hand position, additional mass may be mounted for different
load scenarios. Rotary encoders for absolute angular measurements are integrated into the
servo drive and mounted to the (elbow) joint connecting upper arm and forearm. The muscles
are modeled as McKibben fluidic muscles, i.e. fiber-reinforced elastomers contracting when
pneumatically pressurized [17]. The muscles’ insertion points are at 50 mm from the elbow joint
center for the biceps and 25 mm from the elbow joint center for the triceps. Table 1 lists
the components used and their specifications. For the mechanical simulation of the impact of an
exoskeleton, a collaborative serial kinematic robot is used. It introduces pressure force via
a link to the shoulder-arm-system test stand (Fig. 2).


Table 1 Components of the mechanical mock-up

Component       | Description
Servo drive     | Kollmorgen C062C gearless cartridge motor
Force sensor    | Kistler 3-axis 9067C piezo force sensor
Rotary encoder  | Baumer absolute rotary encoder optoTurn EAL580
Fluidic muscle  | Festo DMSP-10-250N-RM-CM
Controller      | Bachmann MC220 with AIO and EtherCAT
Upper arm       | Aluminum milled part (3 kg)
Forearm         | Aluminum milled part (2 kg)
Additional mass | Steel (0 to 3 kg)
Fig. 2 Mechanical mock-up of the shoulder-arm-system comprising servo drive (a), upper arm (b),
forearm (c), biceps (f), triceps (g), rotary encoder at the elbow joint (e), force sensor mounted
in the base plate (h), and additional mass at hand position (d)


4 Model 


The mathematical model that describes the behavior of the planar shoulder-arm-system is a
Lagrangian description of the first kind of a double pendulum with lumped masses, as depicted
in Fig. 3.
The governing equations are [22]:

$$m_i \ddot{x}_i = \sum_{j=1}^{n_F} F_j + \sum_{k=1}^{n_c} \lambda_k \frac{\partial f_k}{\partial x_i}, \quad i = 1, \dots, 2N, \quad (1)$$

where m denotes mass, x is a Cartesian coordinate, and F is an applied force (gravitation,
actuation, damping, support). $n_F$ is the number of forces, $n_c$ is the number of holonomic
constraints, $\lambda$ is a Lagrange multiplier, f is a holonomic constraint, and N is the number
of mass points.


Fig. 3 Double pendulum schematic and characteristics for modeling according to the Lagrangian of
the first kind. $(x_1, z_1)$ is the position of the lumped upper arm mass, $(x_2, z_2)$ is the
position of the lumped forearm mass, and the respective angles are enclosed with the z axis. $M_1$
is the torque introduced by the shoulder servo drive, $M_2$ is the torque introduced by the
biceps/triceps pair about the elbow joint, and $M_{d1}$, $M_{d2}$ are the respective damping
torques, proportional to angular frequency. $F_{sup}$ denotes the support vector imposed by
the exoskeleton, g is the gravitational acceleration, and $m_1$, $m_2$, $l_1$, $l_2$ are the lumped
masses and lengths of the limbs, respectively
For the integration of the differential equations, we use an implicit Runge-Kutta
method, which has proven to be numerically more stable than explicit schemes. The chosen
parameters are $m_1$ = 3 kg, $m_2$ = 2 kg, $l_1 = l_2$ = 0.3 m, g = 10 m/s², and $d_1 = d_2$ = 0.4.

The model is used for testing control and identification algorithms before deploying the
code on the test stand, and to get a qualitative understanding of the underlying dynamics
and characteristics of the system.
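As an illustration of such a simulation, the following Python sketch integrates a planar double pendulum with the parameters listed above using the implicit midpoint rule, a one-stage implicit Runge-Kutta method. It deviates from the model described here in two labeled ways: it uses minimal angle coordinates (Lagrangian of the second kind) instead of Cartesian coordinates with constraints, and it assumes viscous damping acting on each absolute angle with the coefficients interpreted in N m s/rad. The implicit stage is solved by fixed-point iteration, which is adequate for the small step size chosen.

```python
import math

# Parameters from the text; the damping interpretation is an assumption
M1, M2, L1, L2, G, D1, D2 = 3.0, 2.0, 0.3, 0.3, 10.0, 0.4, 0.4

def rhs(state, torque=(0.0, 0.0)):
    """State derivative for angles measured from the downward vertical."""
    th1, th2, w1, w2 = state
    delta = th1 - th2
    # Entries of the symmetric mass matrix
    a11 = (M1 + M2) * L1 ** 2
    a12 = M2 * L1 * L2 * math.cos(delta)
    a22 = M2 * L2 ** 2
    # Right-hand sides: centrifugal, gravity, damping, and applied torques
    b1 = (-M2 * L1 * L2 * w2 ** 2 * math.sin(delta)
          - (M1 + M2) * G * L1 * math.sin(th1) - D1 * w1 + torque[0])
    b2 = (M2 * L1 * L2 * w1 ** 2 * math.sin(delta)
          - M2 * G * L2 * math.sin(th2) - D2 * w2 + torque[1])
    det = a11 * a22 - a12 ** 2
    return (w1, w2, (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def implicit_midpoint_step(state, h, iterations=20, tol=1e-12):
    """Solve k = f(y + h/2 * k) by fixed-point iteration, then advance by h * k."""
    k = rhs(state)
    for _ in range(iterations):
        mid = tuple(s + 0.5 * h * ki for s, ki in zip(state, k))
        k_new = rhs(mid)
        change = max(abs(a - b) for a, b in zip(k_new, k))
        k = k_new
        if change < tol:
            break
    return tuple(s + h * ki for s, ki in zip(state, k))

def energy(state):
    """Total mechanical energy (kinetic plus potential)."""
    th1, th2, w1, w2 = state
    kinetic = (0.5 * M1 * (L1 * w1) ** 2
               + 0.5 * M2 * ((L1 * w1) ** 2 + (L2 * w2) ** 2
                             + 2.0 * L1 * L2 * w1 * w2 * math.cos(th1 - th2)))
    potential = -(M1 + M2) * G * L1 * math.cos(th1) - M2 * G * L2 * math.cos(th2)
    return kinetic + potential

def simulate(state, h=1e-3, t_end=5.0):
    """Integrate the free (torque-free) motion from the given initial state."""
    for _ in range(int(round(t_end / h))):
        state = implicit_midpoint_step(state, h)
    return state

initial = (0.5, 1.0, 0.0, 0.0)  # hypothetical initial angles in rad, at rest
final = simulate(initial)
```

Because of the damping, the total energy at the end of the run is lower than at the start and remains above the potential energy of the hanging rest position, which gives a simple plausibility check for the integrator.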


5 Surrogate Model 


Surrogate models are small scale approximations of full-scale descriptions of system dynam- 
ics. Their main purpose is to adequately estimate and predict the motion in phase space, 
usually in a given subset of possible states, limiting the application range and accuracy of 
the surrogate model. 

In this article, we advocate the use of linear regression techniques, particularly the Hankel
Alternative View of Koopman (HAVOK) [18], for two major reasons. Firstly, it preserves the
physical meaning of the states, rendering the computational overhead of an observer
obsolete. Secondly, the obtained linear discrete-time model integrates very well into the model
predictive control framework. Owing to its linearity, the surrogate model is computationally
feasible and embeddable [16], even for optimization-based control strategies such as MPC.

The HAVOK method for deriving linear surrogate models of nonlinear systems on the basis
of measurement data is, in its foundations, a time-delay embedding with a Koopman-theory-motivated
linear propagation of right singular vectors over discrete time. It is closely related
to the eigensystem realization algorithm (ERA) [20] and to the more recent dynamic mode
decomposition with delay (DMDd) [3] (Fig. 4).
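The delay-embedding and regression steps can be sketched with NumPy as follows. This is a minimal discrete-time variant on a synthetic sinusoid, not the HAVOKc implementation referred to in this article; the function names, the number of delays, and the truncation rank are assumptions chosen for illustration.

```python
import numpy as np

def hankel_matrix(x, n_delays):
    """Stack time-shifted copies of the measurement x into a Hankel matrix."""
    n_cols = len(x) - n_delays + 1
    return np.array([x[i:i + n_cols] for i in range(n_delays)])

def delay_dmd_model(x, n_delays, rank):
    """Fit a linear map that advances the leading right singular vectors one step."""
    H = hankel_matrix(x, n_delays)
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    V_r = Vh[:rank, :]                       # rows are the leading right singular vectors
    X, X_next = V_r[:, :-1], V_r[:, 1:]      # snapshots and their one-step successors
    A = X_next @ np.linalg.pinv(X)           # least-squares best-fit propagator
    return A, U[:, :rank], s[:rank]

# Synthetic measurement: a sinusoid with angular frequency 2 rad/s (assumed test signal)
dt = 0.01
t = np.arange(0.0, 10.0, dt)
x = np.sin(2.0 * t)

A, U_r, s_r = delay_dmd_model(x, n_delays=50, rank=2)
eigvals = np.linalg.eigvals(A)
omega_est = abs(np.angle(eigvals[0])) / dt   # identified angular frequency
```

For this noise-free signal the Hankel matrix has rank two, so the eigenvalues of the fitted matrix lie on the unit circle and their angle recovers the oscillation frequency. With measured data, the truncation rank becomes a tuning parameter.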


6 Control 


Figure 5 shows the schematic of how to arrive at a linear surrogate model-based controller
of the identified i/o behavior from an exoskeleton’s support vector to shoulder torque. The
procedure is divided into an open-loop and a closed-loop branch. The open loop serves
system identification. While performing a predefined task under trajectory-tracking control,
i.e. keeping the hand position of the mechanical mock-up on a motion path, we impose force
perturbations on the upper arm and read the resulting shoulder torque. This input/output
mapping is subsequently used by a linear-regression-based method to create a small linear
surrogate model suitable for real-time control. For the closed-loop branch, we have chosen a
model predictive control (MPC) strategy as it seamlessly integrates the discrete-time linear
model description obtained from the system identification part. Although MPC is
optimization-based and therefore computationally expensive, it is still applicable for real-time
control due to the linear description of the model and performant algorithms optimized for
embedded systems [16]. In contrast to frequency-domain methods, MPC control goals can
be formulated explicitly as cost functions and state constraints on physical values.
Additionally, the discrete-time setting aligns well with the cycle times used in threads of
programmable logic controllers.

Fig. 4 The Hankel Alternative View of Koopman (HAVOK) method [18] for creating linear surrogate
models from measurements of nonlinear systems. (a) Time-shifted measurements are stacked into a
Hankel matrix H, which is decomposed into its left singular vectors U, right singular vectors V,
and singular values S. (b) Only the first r right singular vectors, corresponding to the largest
singular values, are stored; the remaining vectors are discarded. (c) Dynamic mode decomposition,
a linear regression technique, is applied to truncated versions of V, denoted X and X’. (d) The
best-fit matrix propagates the right singular vector $v_k$ one time step. (e) From the singular
value decomposition of the Hankel matrix we have $V^T = S^{-1} U^T H$. (f) The closed-form solution
for the propagation of physical states in a time window of length r can be stated explicitly as a
linear mapping of the truncated versions of U, S, and the best-fit matrix

Fig. 5 Cascaded strategy of cobot trajectory control: In an open-loop system identification process
(blue), the plant follows a given, periodic motion pattern. This movement is perturbed by force
signals imposed on the plant, and the resulting shoulder torque is read. From the i/o data, a
surrogate model is derived using linear regression techniques such as Hankel Alternative View of
Koopman (HAVOK), Eigensystem Realization Algorithm (ERA), and Subspace Identification (SSI). This
model (A, B) then forms the basis for a model predictive control algorithm that closes the loop
between the measured state $x_k$ and the computed input $u_k$
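The closed-loop branch can be illustrated with a deliberately small Python sketch. It assumes a scalar discrete-time surrogate model x[k+1] = a * x[k] + b * u[k] and omits constraints, so the finite-horizon optimization reduces to a backward Riccati recursion that is re-solved in every control cycle. The model values, weights, and function names are hypothetical; a constrained problem would require a quadratic programming solver instead.

```python
def finite_horizon_gain(a, b, q, rho, horizon):
    """Backward Riccati recursion; returns the feedback gain for the first step."""
    P = q          # terminal weight
    K = 0.0
    for _ in range(horizon):
        K = (b * P * a) / (rho + b * P * b)
        P = q + a * P * a - a * P * b * K
    return K

def mpc_closed_loop(a, b, x0, steps, q=1.0, rho=0.1, horizon=10):
    """Receding-horizon control: apply only the first input of each optimized sequence."""
    x = x0
    trajectory = [x0]
    for _ in range(steps):
        K = finite_horizon_gain(a, b, q, rho, horizon)  # re-solved every cycle
        u = -K * x
        x = a * x + b * u
        trajectory.append(x)
    return trajectory

# Hypothetical unstable surrogate model (a > 1) regulated to zero
trajectory = mpc_closed_loop(a=1.2, b=1.0, x0=1.0, steps=20)
```

Because the model is linear and time-invariant, the gain is identical in every cycle in this sketch. Recomputing it is kept only to mirror the receding-horizon structure, in which constraints or an updated model would change the solution from cycle to cycle.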


7 LiDAR Sensors 


For measuring the planar movement of the shoulder-arm-system, actuated by a serial
kinematic robot, a LiDAR multi-sensor system, specifically developed for the task of tracking
human motion, is applied. It consists of eight Intel RealSense L515 time-of-flight
sensors with a 30 Hz frame rate, a depth accuracy of approx. 5 mm, and an integrated RGB camera
for color information. The sensors are spatially distributed to capture the scene from
different angles, and their individual point clouds are registered into an integrated scan based
on an extrinsic calibration in a postprocessing step. Wiring and components are depicted in Fig. 6.
Sensor placement is challenging because it must trade off minimizing occlusion effects
due to shadowing against minimizing interference between individual sensors, which is a side
effect of their active measurement principle. To account for the interference, the sensors are
triggered with temporal delays. The main advantage, and inherent characteristic, of a LiDAR
measurement system is its ability to collect surface information of the captured object and
therefore to contribute greatly to the classification and identification of movement patterns.
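The two processing ideas mentioned above, registration by extrinsic calibration and delayed triggering, can be sketched in plain Python. The sketch assumes that each sensor pose is described by a rotation about the z axis and a translation, and that the trigger delays are spread evenly over one frame period. Both are simplifying assumptions for illustration and do not describe the actual calibration or trigger strategy of the system.

```python
import math

def rigid_transform(points, yaw, translation):
    """Rotate points about the z axis by yaw (rad), then translate them."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz) for x, y, z in points]

def merge_clouds(clouds, extrinsics):
    """Map every sensor cloud into the common frame and concatenate the results."""
    merged = []
    for points, (yaw, translation) in zip(clouds, extrinsics):
        merged.extend(rigid_transform(points, yaw, translation))
    return merged

def trigger_offsets(n_sensors, frame_rate_hz):
    """Evenly staggered trigger delays within one frame period (assumed strategy)."""
    period = 1.0 / frame_rate_hz
    return [i * period / n_sensors for i in range(n_sensors)]

# Two hypothetical sensors observing the same physical point at (1, 1, 0)
cloud_a = [(1.0, 1.0, 0.0)]                   # sensor A coincides with the common frame
cloud_b = [(1.0, 0.0, 0.0)]                   # sensor B is rotated by 90 degrees and shifted
extrinsics = [(0.0, (0.0, 0.0, 0.0)), (math.pi / 2.0, (1.0, 0.0, 0.0))]
merged = merge_clouds([cloud_a, cloud_b], extrinsics)
offsets = trigger_offsets(8, 30.0)
```

After the transformation, both sensors report the shared point at the same coordinates in the common frame, which is the property a correct extrinsic calibration must provide before the clouds can be merged.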


Fig. 6 Wiring and components of the LiDAR multi-sensor system (labels: trigger module, CPUs for
post-processing and system control, USB-C, trigger strategy for minimizing interference,
registered point cloud)
8 Results and Discussion


The double pendulum system described in Sect. 4 was stabilized with an LQR controller
by linearizing about an operating point with torques $M_1 = -1.2$ Nm and $M_2 = 2.3$ Nm,
representing the lower hand position of the trajectory of the task of picking up a workpiece and
mounting it overhead. Figure 7 shows the damping effect of the controller when exposed to
inputs introduced by the supporting structure. The input signal is a normalized measurement
of an XSENSOR pressure mat, located at the load introduction area of the exoskeleton’s arm
shell, recorded over a full motion path when carrying out the task of picking up a workpiece
and mounting it overhead, and integrated over the area [24]. The controlled shoulder-arm
complex serves as a model for the real behavior of a human arm when exposed to external
disturbances.

We planned to apply the HAVOK with control (HAVOKc) method, described
in Sect. 5, to create a linear surrogate model for mapping the exoskeleton force input to
shoulder torque, but we have not yet succeeded in implementing it. Python code for the
model, input data, controller, and the HAVOKc attempt is available at
https://bitbucket.org/maxherrmann/havok.

For the LiDAR system, the current state of development allows for capturing point clouds
with four sensors measuring simultaneously. Figure 8 shows a sequence of images taken of
a person taking a seat in a chair. The accuracy of the system has not yet been evaluated.
Fig. 7 Simulation of forearm x position and phase plot of forearm (x, z) position. Blue represents
the free dynamics of the forearm when exposed to small-signal support inputs; orange shows the
oscillation-attenuated forearm dynamics using LQR control

Fig. 8 Point cloud sequence (a) to (c) of a person taking a seat, recorded by four mutually
registered LiDAR sensors


9 Summary and Outlook 


A mechanical twin of the human shoulder-arm-system, coupled to a serial kinematic robot
that introduces pressure force into the upper arm to support lifting, is used to address the
question “Can shoulder torque of a mechanical mock-up be controlled with an appropriately chosen
support vector over time based on data-driven linear surrogate models and a LiDAR motion
capture system?”. The mechanical twin is a cybernetic arm, an analog simulator, equipped
with rotatory and translatory actuators and designed with the same dimensions and mass
distribution as a human arm, mimicking its motion. This enables real-time reading of the
torque in the shoulder and elbow joints and of the reaction forces sensed with dedicated force
sensors integrated in the fixed bearings of the motor. The collaborative robot simulates the
impact of the exoskeleton on the upper arm via a mechanical coupling for pressure force
transduction.

A concept of how a HAVOKc-based system identification can be carried out while the 
simulated shoulder-arm-system is moving on a trajectory-controlled periodic path is out- 
lined. The resulting transfer function from introduced load at the upper arm to shoulder 
torque is obtained as a linear surrogate model. The surrogate model evaluates faster than the 
full model while preserving the dominant characteristics, and can thus be incorporated into 
a trajectory-tracking controller. 

Simulations were carried out to validate the mechanical model and to test the performance
of a linear-quadratic controller that stabilizes the hand position. For small-signal
perturbations imposed by a load vector acting upon the forearm, the resulting oscillations
observed at the uncontrolled arm were effectively attenuated.

We are introducing a novel multi-sensor LiDAR system, merging individual sensor
measurements into an integral point cloud by mutually registering the data sets. In a subsequent
post-processing step, features, i.e. segment positions and orientations, are extracted and used
as reference signals for downstream control loops.

All the mentioned test stands, sensors, and algorithms are still in the development phase,
so this article sketches an outline and a concept of the investigations to come.
In future research, the mechanical mock-up can be replaced by a mapping from kinematics
to kinetics. This is usually accomplished by introducing human motion data to a
musculoskeletal model and, by means of inverse kinematics and inverse dynamics, computing the
internal load state of a human [10, 11]. Since this approach is computationally expensive
and infeasible for real-time control, surrogate models might also be a suitable measure
for addressing this problem.
Abstract


In recent years, the number of industrial exoskeletons has significantly increased. As
a large share of assembly tasks still requires the execution of manual work, exoskeletons
may help provide support to users and, thus, reduce physical strain on the human
musculoskeletal system. However, exoskeletons still lack empirical evidence on their
potential relieving effects on the human body and are, thus, not widely deployed in
industrial applications yet. To investigate the impacts of exoskeletons and promote their
future adoption in the industry, industrial settings are increasingly modeled as different
test scenarios in a laboratory environment. Within this frame, this paper presents a
study (n = 4) investigating the effects of both an exemplary passive and active exoskeleton
in an overhead screwing task. The qualitative and quantitative analysis by means
of a questionnaire study as well as electromyographic investigations reveals significant
support potentials of exoskeletons on users in assembly tasks.


Keywords 


Exoskeleton - Overhead assembly - Ergonomic assessment - Physical support - 
Future workplace - Human-Machine interaction 


L. Ralfs (✉) · T. Peck · R. Weidner

Institute of Mechatronics, Chair of Production Technology, University of Innsbruck,
Innsbruck, Austria

e-mail: Lennart.Ralfs@uibk.ac.at


T. Peck
e-mail: Tobias.Peck@student.uibk.ac.at


© The Author(s) 2023
T. Schüppstuhl et al. (eds.), Annals of Scientific Society for Assembly, Handling and
Industrial Robotics 2022, https://doi.org/10.1007/978-3-031-10071-0_17




1 Introduction 


Despite the increasing trend toward automation and industry 4.0 in production sys- 
tems [1, 2], human operators will remain a central player and factor in industrial factories 
[1, 3, 4]. It is expected that future-proof jobs in production will be characterized by human- 
machine interaction [1, 5, 6], hybrid systems consisting of human and robotic operators 
[2], and a paradigm shift from task-centric to human-centric workplaces [3, 7]. Due to the 
remaining share of manual work, workers will continue to be exposed to the risk of suffering 
from musculoskeletal disorders (MSDs), which are the most common reason for sick leave 
in industrial occupations [8-10]. Concerning assembly tasks, working in particularly
stressful and unergonomic postures as well as repetitive work processes, such as overhead work,
are a decisive risk factor for causing upper extremity MSDs [8-11] and stress the increasing 
importance of their prevention and an ergonomic work design [5, 7-10]. Support systems 
such as exoskeletons are one possible remedy, with the potential to relieve users during the 
execution of their work [3, 4, 6]. However, exoskeletons are not widely used in the industry 
yet, as evidence of the relief effects of exoskeletons, especially in the long term, is scarce 
[4, 12]. In terms of exoskeletons supporting overhead work, the literature predominantly 
describes studies with passive exoskeletons since no active shoulder-supporting exoskeletons 
are currently available on the market. The article starts at this point and presents a laboratory 
test setup of an overhead assembly task, which allows a combined subjective and objective 
assessment for both an exemplary passive and active exoskeleton. Thus, it operationalizes a 
station from a test course for industrial exoskeletons [13] and enables a pre-study on the exo- 
skeleton’s contributions to user support and future ergonomic workplaces. 


2 State of the Art 


For the multicriteria evaluation of exoskeleton’s supportive effects, a multitude of criteria 
and methods are applied, which are suitable for evaluating different support scenarios to 
varying degrees but are mainly performed in laboratory environments up to now [4, 14]. 
Depending on the desired focus, subjective (e.g., Borg scale, observations) and objec- 
tive (e.g., electromyography, motion capture) evaluation methods are capable of deliver- 
ing results, of which the examination of the physical relief by means of electromyography 
(EMG) is the most frequently used method [14]. However, a comprehensive evaluation 
includes complementary subjective and objective measurement methods [4, 11, 12, 14-17]. 
Different foci of investigation are set in laboratory settings, relating to the study of
either singular tasks at workstations or more complex processes at integrated workplaces 
or in test courses [13]. A considerable number of laboratory studies have already been 
conducted to measure muscle activity during overhead assembly tasks. Thus, the unload- 
ing effect of exoskeletons during drilling in different directions, force application points, 
and body postures has already been investigated [18, 19]. In other articles, subjective cri- 
teria are examined in addition to muscle activity. Objective measurements during over- 
head tasks such as drilling, riveting, grinding, or lifting heavy objects are supplemented 
by surveys on, e.g., perceived discomfort and sense of stress [11, 12, 15, 16]. Overhead 
assembly tasks are also studied in industrial environments using EMG and questionnaire 
studies or Borg scales [12, 17]. 

However, almost exclusively passive shoulder-supporting exoskeletons have been 
investigated in previous studies. Therefore, a novel aspect of this work is the comparative 
evaluation of the suitability of both an active and a passive system concerning the sup- 
port effect for an exemplary application scenario. 


3 Materials and Methods 


For evaluating the support effect of exoskeletons, a characteristic overhead assembly task 
was considered, which the subjects performed with and without exoskeleton support. 
Its test setup followed a proposed approach of laboratory-based modeling of industry- 
related tasks [13]. 


3.1 Study Participants 


The study population included four right-handed male volunteers, all of whom were in
a physically healthy condition and did not report current shoulder pain. The subjects were
between 21 and 24 years old (mean: 22.5 years), between 174 and 190 cm tall
(mean: 180 cm), and weighed between 65 and 83 kg (mean: 75.3 kg).


3.2 Test Setup 


The task consisted of setting and fastening two bolts side by side in a wooden beam 
(mounted at a reference height of 2.1 m), using an electric screwdriver of mass 2.55 kg. 
The start and end pose of the task were equal, where the screwdriver was held in an 
angled arm position without the tool in reach. During the execution, the two bolts were 
first set and then fixed in the wooden beam by a vertical upward movement of the arm. 
The subjects were not given any specific instructions regarding the speed at which to
perform the task. However, the execution of the screwing process should be similar in
all runs. For high comparability between runs and subjects, the task was performed in
a standardized manner. Accordingly, the investigation focused on the screwing as the core
activity and excluded, e.g., the gripping of the screwdriver. In addition, the mounting position
of the beam was individually adjusted to the subject’s height, allowing subjects to consistently
perform the task in an upright posture and guide the screwdriver with the dominant hand
while the non-dominant hand set the bolts. Furthermore, the lower and upper arms were at
right angles to each other during the screwing. Each subject performed the screwing in
three scenarios: (A) without exoskeleton support as well as with support by a (B) passive and
(C) active exoskeleton. Figure 1 illustrates an excerpt from the task showing the exact
pose in the baseline (left) and supported (right) scenario.

Fig. 1 Assembly task for baseline (left) and supported scenario with Lucy 2.0 (right)


3.3 Used Exoskeletons 


A passive (Skelex 360) and an active (Lucy 2.0) exoskeleton were used as examples to
evaluate the support effect of exoskeletons for overhead assembly tasks. The passive
exoskeleton Skelex 360 provides a supportive force when lifting the arms, thus counteracting
the arm’s force of gravity [20]. Two carbon-fiber leaf springs generate support and
compensate for a weight of up to 3.5 kg per arm [20]. The maximum supporting torque
equals 6 Nm [12]. Like the Skelex 360, the active exoskeleton Lucy 2.0 mainly supports
users performing tasks at or above head level [21]. The main difference lies
in the generation of the supporting force. Lucy 2.0 uses rigid shoulder kinematics with
inserted pneumatic actuators for creating the support effect [21]. With this actuation
principle, the level of support can be controlled continuously [21] to generate a maximum
torque of approximately 8.5 Nm at an arm bending angle of 85 degrees [15]. Before
performing the task with exoskeletons, the subjects familiarized themselves with the systems.


3.4 Applied Evaluation Methods 


For comprehensively evaluating the support effects of both exoskeletons, the assessment 
combines a questionnaire survey of the subjects and an electrophysiological measure- 
ment of muscular activity. In the closed questionnaire study, (1) the perceived exertion 
and (2) the perceived support effect provided by the exoskeletons were asked for after 
performing the task. In contrast, EMG tracks the muscular activities of the medial del- 
toid (shoulder) and the erector spinae (back extensor) during the execution of the task. 
EMG uses surface electrodes and measures electrical signals in the microvolt range emit- 
ted by muscle cells [22]. The EMG sensors were placed on the muscles according to 
the SENIAM guidelines and in the fiber direction. Wireless surface EMG (Myon, Aktos, 
960 Hz) was used during the studies. 


3.5 Data Acquisition and Processing 


Before performing the task, the maximum voluntary contraction (MVC) was measured
for each subject to determine his peak muscular activity for the later analysis [23].
These MVC measurements formed the basis for the subsequent normalization of the
data. Afterward, the muscle signals were recorded during the execution of the task with
a frequency of 1000 Hz. However, these raw signals are not sufficient for evaluating
the effectiveness of exoskeletons. Obtaining meaningful results requires a data
transformation of the EMG amplitude to a relative scale (% MVC) [23, 24]. Therefore, a
four-step procedure is necessary: (a) rectification and filtering of the raw signal (for the
generation of positive and filtered signals), (b) MVC-normalization (for the elimination
of technical, anatomical, and physiological influences as well as for better
illustration and comparison of stress levels), (c) activity separation (for cutting the
relevant activity sequences from the entire signal), and (d) time normalization (for
tailoring and relativization of task durations between subjects) [23, 24]. Statistical parametric
mapping (SPM) [25] helped analyze and interpret the EMG data. Within this frame,
statistical methods tested hypotheses for region-specific effects [25] between the baseline
scenario and the scenarios with exoskeleton support. A nonparametric, unpaired
two-sample t-test checked the data for mean differences at a significance level of five percent.
By comparing the scenarios with and without an exoskeleton, the effect on the muscular
activities was investigated at each point in time. As a result, movement sequences
were determined in which the signals significantly differed and, thus, an effect of the
exoskeleton existed. Because each of the four subjects screwed in two bolts per scenario, the
total data pool doubled to eight measurement sets.
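The four processing steps (a) to (d) can be sketched in plain Python as follows. The moving-average filter, the window length, the activity boundaries, and the 101-point time base are assumptions for illustration; the study does not state which filter or resampling scheme was used.

```python
def rectify(signal):
    """Step (a), part 1: full-wave rectification."""
    return [abs(v) for v in signal]

def moving_average(signal, window):
    """Step (a), part 2: smoothing with a centered window that shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def normalize_to_mvc(envelope, mvc_peak):
    """Step (b): express the envelope in percent of the MVC peak."""
    return [100.0 * v / mvc_peak for v in envelope]

def separate_activity(signal, start, stop):
    """Step (c): cut the relevant activity sequence from the full recording."""
    return signal[start:stop]

def time_normalize(signal, n_points=101):
    """Step (d): linear interpolation onto 0..100 % of the task duration."""
    last = len(signal) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)
        i = int(pos)
        frac = pos - i
        nxt = min(i + 1, last)
        out.append(signal[i] * (1.0 - frac) + signal[nxt] * frac)
    return out

def process_emg(raw, mvc_peak, start, stop, window=50):
    """Chain steps (a) to (d) for one muscle and one task execution."""
    envelope = moving_average(rectify(raw), window)
    relative = normalize_to_mvc(envelope, mvc_peak)
    return time_normalize(separate_activity(relative, start, stop))

# Synthetic raw signal alternating between +0.5 and -0.5 (hypothetical units)
raw = [0.5 if i % 2 == 0 else -0.5 for i in range(1000)]
profile = process_emg(raw, mvc_peak=1.0, start=100, stop=900)
```

After this processing, every task execution is represented by the same number of samples on a percent-MVC scale, so that runs of different duration and subjects with different absolute signal levels can be compared point by point.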


4 Results 


This section describes the results of the studies conducted. First, the results of the ques- 
tionnaire study are presented, followed by those of the EMG study. 


4.1 Results from Questionnaire Study 


The results from the questionnaire study on the (1) perceived exertion and (2) perceived 
support effect provided by the exoskeletons are illustrated using the Borg RPE scale 
(6-no exertion to 20-maximum exertion) [26] and Likert scale (1-low to 5-high), 
respectively. The data are presented as boxplots to visualize the median and standard 
deviation. Additionally, a dot within the boxplot indicates the mean value. 

The first question evaluated the rate of perceived exertion (RPE). For this purpose, the 
subjects assessed their RPE for each of the three executed runs of the task. The left-hand 
chart in Fig. 2 shows the results of this survey. The three boxplots display the evalua- 
tion for the investigated scenarios (A) without exoskeleton support (left plot) as well as 
with support by (B) Skelex 360 (middle plot) and (C) Lucy 2.0 (right plot). For the base- 
line scenario, i.e., executing the task without exoskeleton support, a mean RPE value of 
10.75 was determined. According to the Borg scale, this corresponds to a light perceived 
exertion [26]. Performing the task with the support of an exoskeleton resulted in a mean 
RPE of 8 (Skelex 360) and 7.5 (Lucy 2.0), respectively. These ratings each correspond
to a level of effort perceived as extremely light [26]. However, the width of the boxplots
illustrates a broader distribution in subjects’ assessments of exoskeletal support
compared to the baseline scenario. Accordingly, there is a higher divergence in evaluating (B)
and (C). Nevertheless, the RPE mean value notably differs for the supported scenarios
compared to the non-supported scenario.

Fig. 2 Results from study on perceived exertion (left) and perceived support effect (right)

The second question evaluated whether the subjects felt a supportive effect of using 
Skelex 360 and Lucy 2.0. The right-hand chart in Fig. 2 shows the results of this survey. 
The perceived supportiveness of Skelex 360 (with a mean of 4.5) and Lucy 2.0 (4.75) 
was rated as high for both exoskeletons. Accordingly, the subjects’ ratings indicated a 
perceived support effect of both Skelex 360 and Lucy 2.0. 

For both (1) the perceived exertion and (2) the perceived support effect, the evalua- 
tions of the questionnaire study indicate a support effect by Skelex 360 and Lucy 2.0. 
However, since the results so far are only based on the subjective assessment, the results 
of an additional objective measurement of muscle relief are described below. 


4.2 Results from EMG Study 


This section describes the analysis and evaluation of the EMG investigation. Figure 3
shows the results of evaluating the passive exoskeleton Skelex 360 compared to the
baseline scenario. As the lower graph of Fig. 3 shows, the t-value exceeds the reference
value for the chosen significance level between 45% and 78% and between 86% and 94% of
the task time (p-value = 0.014). In terms of the task, this means the subjects were
supported during large time fractions in the second half of the task execution, in which
they screwed overhead with the dominant hand. Moreover, the peak in significance around
90% of the task time is striking, where the bolt was sunk into the wooden beam with a
slightly increased force applied. Accordingly, the analysis detects significant support for
the deltoid muscle by Skelex 360 in the named ranges. On this basis, the curves of the
relative muscular activity (in % MVC), shown in the upper graph of the figure, can now
be interpreted. For the significant time portion of the support, the relative muscular
activity while using Skelex 360 equaled 15% MVC over most of the task execution. Its use
resulted in a muscular relief for the medial deltoid of 10.8%-points relative to the MVC
measurement. Correspondingly, using Skelex 360 revealed a maximum unloading effect of
40.6% during the task fraction of overhead screwing. For the other task fractions, there
was no significance according to SPM. Consequently, the curves do not permit a
meaningful interpretation there. The same applies to the support of the erector spinae
muscle, where no significant support resulted for the entire course of the task.

Fig. 3 Analysis of the support effect for Skelex 360 in terms of significance (lower graph) and
reduction of activity for medial deltoid muscle (upper graph)

Similarly, Fig. 4 shows the analysis results with the active exoskeleton Lucy 2.0 com- 
pared to the baseline scenario. As the lower graph in the figure shows, the t-value exceeds 
the significance threshold over almost the entire task course (p-value=0.014). Conse- 
quently, significant support by Lucy 2.0 was detected for the deltoid muscle over nearly 
the complete task execution (setting the bolts and screwing overhead), except for the last 
five percent of the time (lowering the dominant hand holding the screwdriver). The three 
sections of the movement sequence, beginning of elevating the arm to set the bolts, (first) 
applying the torque during the screwing, and countersinking the bolt in the beam, reached 
the highest significance. For the significant time portion, the relative muscular activity 
using Lucy 2.0 was 10% MVC in the first part of the task (setting bolts) and increased 
to 15% MVC in the second part (screwing overhead). Accordingly, the second part of the 
task required higher muscular activity. Overall, the use of Lucy 2.0 resulted in a relief of
the medial deltoid muscle of 12.2%-points relative to the MVC measurement. The upper
graph visualizes a maximum unloading effect of 49.6% for Lucy 2.0. As in the run
with Skelex 360, there is no significant unloading of the erector spinae muscle.

Consequently, the statistical analysis of this task supports the results of the subjective 
questioning and shows significant support potentials regarding the reduction of muscle 
activity by using both exoskeletons. 


5 Discussion 


In this section, the results obtained in the study are abstracted regarding limitations in the 
study design and results, lessons learned, and implications on future workplaces. 


5.1 Limitations of Studies and Results 


First of all, it is crucial to stress that the results are based on the specific test design
described in Chap. 3.2 and are only valid in this respect. Accordingly, the obtained results
depend on the task, its execution by the respective subjects, and the exoskeletons used.
Limitations in the test design include the standardization of the processing and the time
required for test persons to become accustomed to using exoskeletons. Both factors
influence the execution of the tasks and, thus, the reproducibility of the results, since
individual movement behavior and longer familiarization with the exoskeletons might produce
different results. In addition, the study was conducted with only four subjects, all of them
young, male, and in good physical condition, which is one of several reasons why a larger
sample is needed for improved evidence and higher informative value of the results.
Regarding the evaluation of the measurement data, the processing and cutting of the
measurement signals also play a role [23, 24]. All these aspects influence the validity and
especially the reliability of the results.


5.2 Lessons Learned from Studies 


The article indicates it is not feasible to make a blanket statement about an (unlimited) 
support of an exoskeleton. The results must always be related to individual sections of 
motions and can only be evaluated against the task. Especially in the example of Skelex 
360, the analysis of the results shows the relevance of dividing the complete task into 
single fractions. The same effect also applies to analyzing the muscular unloading effect 
caused by exoskeletons. Even if the curve progressions show a different level in terms 
of relative strain, no meaningful interpretation is valid unless significance is proven. 
Besides, relieving effects of exoskeletons can only be compared against each other if the 
support characteristics and torques induced by the exoskeletons are identical over the 
course of the angle. As a result, the study stresses the importance of equally considering
subjective and objective criteria in the evaluation, as they can provide complementary
results. Notwithstanding this, the results of the objective EMG investigation provide bet- 
ter empirical evidence than those of the questionnaire study. 


5.3 Implications on Future Workplaces 


The results support exoskeletons as a viable approach to designing sustainable
and ergonomic industrial workplaces. Particularly against the background of trends
such as human-machine interaction [1, 5, 6] and user-centric workplace design [3, 7], 
exoskeletons can significantly contribute to supporting employees while maintaining their 
flexibility in manual work processes simultaneously [1]. Besides, support systems such as 
exoskeletons offer the opportunity to preserve human skills and abilities [27] and provide 
physical relief at the same time. Thus, using exoskeletons can constitute an attractive and 
human-oriented initiative to maintain the employee’s health. 


6 Conclusion and Outlook 


This article describes the modeling of an exemplary overhead assembly task in a labo- 
ratory environment and its execution in different test scenarios with and without exo- 
skeleton support. The support effects for Skelex 360 and Lucy 2.0 were evaluated. Plans 
include expanding the studies to a larger collective of subjects, tasks, and exoskeletons. 
Additionally, it seems reasonable not only to investigate the effect of exoskeletons in 
terms of physical but also cognitive support. However, within the framework of this 
study, the article provides evidence that passive and active exoskeletons can lead to 
(objectively verifiable) muscular and (subjectively) perceived physical relief in separate 
movement sequences and tasks and, thus, can become a considerable element of ergo- 
nomic and human-centric industrial workplaces with future orientation. 


Abstract


The implementation of plug and fly principles for the assembly of high-lift systems 
requires the design of new work processes and jigs. Conventional jigs are inflexible
and lack the capability to adapt to changes in product and process design. Adaptive jigs
provide higher flexibility and allow workers to position the product in accordance with
their personal needs. This paper presents a novel wheel-shaped adaptive jig and inves- 
tigates its influence on the ergonomics of assembling a high-lift system. A CAD pro- 
gram is used to model the high-lift system, the adaptive jig and workers. Based on 
these models, virtual scenes are generated that can be assessed with the key indicator 
method. The results show that the adaptive jig design improves ergonomics by elimi- 
nating the riskiest tasks during the assembly process. The adaptive jig concept has a 
high potential for improving production processes because it can respond flexibly to 
changes in the assembly process as well as to changes in product design. 


Keywords 


Assembly · Ergonomics · Modelling


S. Hogreve (✉) · H. Wallmeier · K. Tracht
Bremen, Germany 
e-mail: hogreve@bime.de 


© The Author(s) 2023
T. Schüppstuhl et al. (eds.), Annals of Scientific Society for Assembly, Handling and 
Industrial Robotics 2022, https://doi.org/10.1007/978-3-031-10071-0_18 


1 Introduction 


The assembly of large-scale flight systems is an essential part of the value-added process 
in the aviation industry. Fuselages, wings, engines and other components are assembled 
by hand in cycle lines. In wing outfitting, the assembly of high-lift systems is tradition- 
ally done by mounting the individual components directly to the wing box. With a plug 
and fly assembly concept, the high-lift system can be pre-assembled, adjusted and tested 
as a stand-alone unit [1]. The ready-to-fly module is then joined to the wing in the final 
assembly line (FAL) with only a few joints. By outsourcing the assembly of the high-lift 
system, the cycle time in the wing outfitting can be shortened and the factory production 
rate increased. In addition, pre-assembly of the high-lift system can improve the ergo- 
nomics of assembly because the subassembly offers improved accessibility and can be 
moved to an ergonomically favourable position and orientation with less effort. For such 
a modular design of the wing, no reference concepts for the assembly of the high-lift sys- 
tem exist yet. Both the assembly organisation and the required operating equipment must 
be rethought. 

The use of assembly jigs for precise and repeatable assembly of the components is 
widespread in the aerospace industry [2]. They are required to ensure accurate joining 
operations during the assembly of large dimensional aircraft components such as wings 
and high-lift systems. The jigs must be rigid and precise, and must therefore be matched 
to the product and the assembly process in question. This results in inflexibility with 
respect to shape and dimensional changes of the product [3]. New innovative jigs are 
needed to ensure high manufacturing accuracies and flexibility. They must be able to 
position large components easily and be flexible at the same time [3]. To combine high 
productivity and flexibility, assembly fixtures need a higher degree of automation. They 
also need to provide greater adaptability and improved interaction with workers [4]. To 
meet these requirements, collaborative robots seem suitable. They can relieve humans 
and protect them from physical overload by taking over heavy and repetitive tasks [5]. 

The aim of the research work presented here is the development of an adaptive jig that 
offers a high degree of adaptability with regard to product as well as process changes. 
The jig should enable the entire assembly process of the high-lift system in one clamp- 
ing. In addition, it should offer workers the possibility for individual adjustments of the 
working position in order to improve both physical and cognitive ergonomics. The con- 
cept of such an adaptive jig is presented in [6]. The adaptive jig uses collaborative robots 
to position the components to be assembled. Consideration of physical and cognitive 
ergonomics has gained importance in the design of systems where humans and robots 
collaborate [5]. Several research efforts focus on methods to simulate physical and
cognitive ergonomics with models and to evaluate the acceptance of human-robot
collaboration before building the real device. Beuß et al. propose an ergonomics study based
on a simulation and virtual reality [7]. They show that an analysis with digital human
models is possible. Fritzsche shows the high degree of agreement between real and virtual
ergonomics assessments and thus demonstrates their usefulness [8].

This paper investigates the influence of the adaptive jig on the physical ergonomics. 
Using the assembly of a high-lift system as an example, the adaptive jig is compared 
with a rigid jig. Human modelling in a CAD system is used for a first evaluation of the 
ergonomic potential. The key indicator method is used to measure the risk of physical 
overload. The investigation will show if the adaptive jig is suitable for assembly and if 
it offers at least an equivalent ergonomic potential as a rigid jig. Based on the results, 
a decision can be made for or against building a physical demonstrator and conducting 
real-world tests. 


2 Product and Jig Design


2.1 Assembly Object


The investigated product is a high-lift system of a medium-range jet. However, only 
the assembly of the outboard landing flap with the associated supports is considered. 
Figure 1 shows a section with the main elements. The basic component is the aero flap
support (AFS). It carries the moving components of the high-lift system and at the same 
time acts as an aerodynamic fairing [1]. The flap lever, actuator and landing flap are con- 
nected to the support by means of bolt connections. The high-lift unit is connected to the 
wing box at three points through the main and forward attachments. All bolt connections 
are designed to be fail-safe, i.e. they consist of two bolts slid into each other in opposite 
directions, which are fixed with lock nuts and locking plates. Since each bolt connection
consists of at least six components, depending on the design, the components are
summarised in the following under the term assembly kit (AK). It is assumed that the
components are provided ready for assembly at the assembly line. Manufacturing operations
such as drilling, milling, deburring or surface treatment are not part of the consideration.
For example, it is assumed that the main bridge and other metal brackets on the AFS are
already joined when it arrives at the pre-assembly line.


Fig. 1 High-lift system with bolt connectors [9] (labelled elements: outer flap, main
attachment with floating bearing, main attachment with fixed bearing, forward attachment,
main bridge, brackets, flap lever, MF-BSA, AFS)

To compensate for angular errors and reduce stresses, e.g. during thermal expansion, 
spherical bearings are integrated in all connection points. Until assembly is completed, 
all components are therefore movable in several degrees of freedom in relation to each 
other and must therefore be supported and held in position by a fixture. 


2.2 Assembly Devices 


Adaptive Jig. Figure 2 shows a rough design of the adaptive assembly device. The
positioning and orientation of the assembly parts is taken over by industrial robots, 
which have suitable end effectors for clamping the components. The industrial robots 
are arranged on a circular seventh axis. Due to its shape, this adaptive assembly device is 
also called assembly wheel [6]. The redundant kinematics give the robots an additional 
degree of freedom. This can be used to move the robots during assembly into a favour- 
able position that interferes least with the workers’ work process. In this way, accessibil- 
ity to the assembly points can be increased. 

The first robot carries the assembly while other robots feed the assembly components 
and position them for the joining process. Workers then assemble the bolt connectors 
manually. The robots are able to change the position of the assembly in space so that 
workers of different heights can comfortably work on the object. Furthermore, chang- 
ing the orientation can prevent working overhead or while kneeling. To ensure safe 
operation, the robots must be equipped with functions for human-robot collaboration, 
like force sensors and robot skin. During assembly, each support is initially equipped 
in a separate assembly wheel. Then the assembly wheels with the supports are brought 
together and the landing flap is added. 


Fig. 2 Adaptive Jig with support (left) and high-lift system (right) [6]


Fig. 3 Rigid jig holds support and landing flap


Rigid Jig. Since the Plug and Fly high-lift system is a completely new product for
which no reference process exists so far, a rudimentary concept for a rigid jig had to 
be created for this study. Ergonomic requirements were considered in the design, just 
as they would be in an industrial design. It is assumed that the construction consists of a 
rigid frame of welded hollow sections. Functions for adjusting the height or orientation 
of the assembly are not integrated. In order to provide an approximately optimal working 
height for all workers, an average working height of 1100 mm was set. 

The supports are inserted into a clamping device and fixed therein during assembly. 
The assembly is carried out in horizontal orientation, which corresponds to the flight
orientation. The landing flap is assembled in the extended condition to improve
accessibility to the joining points. Figure 3 shows a CAD representation of the concept for the
rigid jig. Since only the general contour and the geometric arrangement of the assembly 
parts in the jig are relevant for determining the influence on an ergonomic working pro- 
cedure, details such as the clamping devices were not designed. 


3 Process Design 

To evaluate the effects on ergonomics, the fixtures must be considered in the context of a 
work process. For the assembly of the high-lift system, an assembly sequence was
determined experimentally in workshops [10]. Based on this assembly sequence, work steps
were defined and an allocation of labor in the cycle line was determined.


3.1 Determination of the Work Steps 


The key indicator method (KIM) [11] is used to compare the effects on ergonomics. It is used
to evaluate the work processes both when using the adaptive jig and when using the rigid
jig, so that the resulting risk values can be compared. Depending on the type of load,
different forms must be used in the KIM. Each of the assembly processes considered is
therefore first broken down into work steps, each of which contains only operations of a
uniform load type. For the assessment of the assembly of the high-lift system, the forms
for the assessment of Lifting, Holding and Carrying of loads (LHC) as well as for the
assessment of Manual Handling Operations (MHO, abbreviated MWP in the tables) are
sufficient. Methods-Time Measurement (MTM) was used to determine the working times for
these work steps. Table 1 shows an overview of the work steps defined for the two work
processes.


3.2 Work Scheduling 


Assuming that 63 aircraft are to be produced per month and that 17 shifts of seven hours 
each are available per week, this results in a maximum cycle time of approximately 
3.75 h per wing (i.e. outboard high-lift system). The actual assembly time per high-lift 
system must be shorter than the cycle time. The assembly time comprises the basic time, 
the recovery time and the distribution time. The basic time is formed by the sum of the 
MTM values of all work steps. The recovery time corresponds to legal requirements and 
the distribution time is estimated based on values from the Federal Ministry of the
Interior and Community. Since some work steps can only be performed by two people, at
least two workers must be scheduled to perform the assembly. Detailed work planning
shows that at least three people are required to complete all work steps in the required
cycle time. This applies to assembly with the adaptive jig as well as to assembly with
the rigid jig. For both assembly processes, a task allocation is carried out in which three
people are equally occupied. The basic assembly time is then approximately 2.8 h per
high-lift system. This leaves sufficient recovery and distribution time within the cycle
time. The allocation of the tasks is included in the determination of the key indicators
in the following chapter and is primarily represented there by the task durations and the
number of repetitive movements.


Table 1 Work steps for the assembly processes with rigid jig and adaptive jig

Rigid Jig                                       | Adaptive Jig
No. | Work step                         | KIM   | No. | Work step                         | KIM
1   | Positioning fairing               | LHC   | 1.1 | Fixing fairing in jig             | MWP
2.1 | Fixing fairing in jig             | MWP   | 1.2 | Fixing fairing in jig             | MWP
2.2 | Fixing fairing in jig             | MWP   | 2   | Providing actuator                | LHC
3a  | Mounting actuator to main bridge  | MWP   | 3   | Mounting actuator to main bridge  | MWP
3b  | Holding actuator                  | LHC   |     |                                   |
4a  | Mounting flap lever to flap front | MWP   | 4a  | Mounting flap lever to flap front | MWP
4b  | Holding flap lever                | LHC   | 4b  | Holding flap lever                | LHC
5   | Mounting flap lever to flap rear  | MWP   | 5   | Mounting flap lever to flap rear  | MWP
6   | Mounting flap lever to fairing    | MWP   | 6   | Mounting flap lever to fairing    | MWP
7a  | Mounting actuator to flap         | MWP   | 7a  | Mounting actuator to flap         | MWP
7b  | Holding actuator                  | LHC   | 7b  | Holding actuator                  | LHC
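
The cycle-time estimate of Sect. 3.2 can be retraced with a short calculation. The sketch below is only a plausibility check under two assumptions that are not stated in the text: a month is counted as four weeks, and each aircraft requires two outboard high-lift systems (one per wing). The function name and parameters are hypothetical.

```python
# Plausibility check of the maximum cycle time per high-lift system.
# Assumptions (not from the text): 4 weeks per month, 2 wings per aircraft.

def max_cycle_time_h(aircraft_per_month, shifts_per_week, hours_per_shift,
                     weeks_per_month=4, wings_per_aircraft=2):
    """Available working hours per month divided by the units to be built."""
    available_h = shifts_per_week * hours_per_shift * weeks_per_month
    units = aircraft_per_month * wings_per_aircraft
    return available_h / units

cycle_h = max_cycle_time_h(aircraft_per_month=63, shifts_per_week=17,
                           hours_per_shift=7)
basic_h = 2.8                   # basic assembly time stated in the text
reserve_h = cycle_h - basic_h   # left for recovery and distribution time

print(f"cycle time: {cycle_h:.2f} h, reserve: {reserve_h:.2f} h")
# prints "cycle time: 3.78 h, reserve: 0.98 h"
```

Under these assumptions the result of about 3.8 h is close to the stated cycle time of approximately 3.75 h, and roughly one hour per high-lift system remains for recovery and distribution time.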


4 Determination of Key Indicators 


The key indicator method (KIM) has become a standard across companies to evaluate 
the ergonomics of a working process [12]. In the automotive industry for example, it is 
used to evaluate assembly activities. The different key indicator methods are designed as
a basic methodology for risk assessment. They describe the most important stress
factors (key indicators) in ordinal scales and determine the degree of likelihood of phys- 
ical overload [11]. The methods are well evaluated and digital forms are provided for 
easy execution [13]. 

To carry out the KIM, the postures of the workers during the assembly processes have 
to be observed and the assembly times have to be determined. Since the study is con- 
ducted before the concept is finalised and the jig is actually built, the study takes place 
with virtual objects. Both the adaptive and the rigid jig exist as CAD models. The
program Siemens NX 11 is used to integrate human models into the CAD models. Four
different human models, representing the 5th and 95th percentile of the male and female
German population respectively, are used for the investigation. Each of the previously
defined work steps is reproduced in a CAD scenario. In each case, a posture that is
characteristic for the assembly step is simulated with the human models. Based on these
scenarios (e.g. Fig. 4), the posture can then be evaluated within the scope of the KIM.
In addition, the human models are used to check whether there is sufficient visibility of 
the assembly spot and the workers’ own hands. The key indicators that cannot be clearly 
identified in the virtual model (such as work organisation) are always assumed to be best 
possible. 

Tables 2 and 3 show the results of the KIM for both assembly jigs. The risk
scores are classified into four categories. Risk values below 20 correspond to a low load
intensity, and values between 20 and 50 indicate a slightly increased load intensity. Both
are acceptable. Risk values between 50 and 100 correspond to a substantially increased
load intensity and require a redesign of the workplace, since physical overload is possible
for normally resilient persons. Work steps with a risk value above 100 have a high load
intensity and will likely cause physical overload for all persons [14].
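
The four risk categories can be written as a simple lookup. The following sketch only mirrors the thresholds named in the text. How values that fall exactly on a boundary (20, 50, 100) are assigned is not specified there, so assigning them to the higher category is an assumption, and the function names are hypothetical.

```python
# Sketch of the four KIM risk categories described in the text.
# Assumption: a score exactly on a threshold falls into the higher category.

def classify_kim_risk(score):
    """Map a KIM risk score to its load intensity category."""
    if score < 0:
        raise ValueError("risk score must not be negative")
    if score < 20:
        return "low"
    if score < 50:
        return "slightly increased"
    if score < 100:
        return "substantially increased"
    return "high"

def is_acceptable(score):
    """The two lowest categories are considered acceptable."""
    return classify_kim_risk(score) in ("low", "slightly increased")

for example_score in (12, 35, 75, 120):  # illustrative values, not study data
    print(example_score, classify_kim_risk(example_score),
          is_acceptable(example_score))
```
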

Fig. 4 Example of posture and view analysis with human models in Siemens NX 11 (visibility
of jig is deactivated). Investigated person: here, female model representing the 5th
percentile. Any further person: always male model representing the 50th percentile.


Table 2 Work steps and corresponding risk values for the assembly process with rigid jig

No. | Work step                         | KIM | MTM    | Risk values (95th and 5th percentile models)
1   | Positioning fairing               | LHC | 1 min  |
2.1 | Fixing fairing in jig             | MWP | 32 min |
2.2 | Fixing fairing in jig             | MWP | 20 min |
3a  | Mounting actuator to main bridge  | MWP | 7 min  |
3b  | Holding actuator                  | LHC | 2 min  |
4a  | Mounting flap lever to flap front | MWP | 11 min |
4b  | Holding flap lever                | LHC | 3 min  |
5   | Mounting flap lever to flap rear  | MWP | 17 min |
6   | Mounting flap lever to fairing    | MWP | 13 min |
7a  | Mounting actuator to flap         | MWP | 10 min |
7b  | Holding actuator                  | LHC | 3 min  |


It can be seen that most work steps cause a low or slightly increased physical load.
However, the holding processes may cause an increased load, especially for female workers.
The rigid jig requires the manual positioning of the fairing within the jig. This work
step causes loads that are too high for women, so it needs to be redesigned
if the rigid jig is to be used. With the adaptive jig, the highest risk occurs during
the provision of the actuator to the robot. This step can easily be eliminated by providing
the actuator with a carrier. While the values in Tables 2 and 3 only represent the risk
of single work steps, Tables 4 and 5 show the cumulated values for a whole working day,
considering the working process described in chapter 3.2. Since workers perform different
tasks during their shift, the risk is evenly distributed. Only women have a high risk
of physical overload during lifting and holding since they all have to position the fairing
together when using the rigid jig.


Table 3 Work steps and corresponding risk values for the assembly process with adaptive jig

No. | Work step                         | KIM | MTM    | Risk values (95th and 5th percentile models)
1.1 | Fixing fairing in jig             | MWP | 32 min |
1.2 | Fixing fairing in jig             | MWP | 20 min |
2   | Providing actuator                | LHC | 1 min  |
3   | Mounting actuator to main bridge  | MWP | 7 min  |
4a  | Mounting flap lever to flap front | MWP | 11 min |
4b  | Holding flap lever                | LHC | 3 min  |
5   | Mounting flap lever to flap rear  | MWP | 17 min |
6   | Mounting flap lever to fairing    | MWP | 13 min |
7a  | Mounting actuator to flap         | MWP | 10 min |
7b  | Holding actuator                  | LHC | 3 min  |


Table 4 Cumulated risk values per working day and person with rigid jig

Percentile | 95th male | 5th male | 95th female | 5th female
Method     |           |          |             |
Person 1   |           |          |             |
Person 2   |           |          |             |
Person 3   |           |          |             |


Table 5 Cumulated risk values per working day and person with adaptive jig

Percentile | 95th male | 5th male | 95th female | 5th female
Method     |           |          |             |
Person 1   |           |          |             |
Person 2   |           |          |             |
Person 3   |           |          |             |


5 Conclusion and Summary


It has been shown that the adaptive jig offers the same or even better ergonomic perfor- 
mance compared to a rigid jig. In particular, it provides very good support when posi- 
tioning heavy objects. The adaptive jig therefore improves the work process. It can be 
assumed that work processes such as adjustment, electrical equipment or painting also 
benefit from the adaptive jig. In particular, when several people with different physi- 
cal constitutions work together and when integrating people with physical disabilities, 
the adaptive device offers further potential for improving ergonomics. For example, the 
assembly object can be brought into an orientation where the assembly points are pre- 
sented to the employees at different heights. 

The adaptive design makes it possible to carry out the entire production process of a 
high-lift system in a single setup. There is no need for relocation and remeasurement in 
another fixture. Further potential arises when considering the adaptive jig over its life 
cycle. The jig can be adapted to changes in the production process or to product changes 
with little effort. This results in a very long service life. Because not all aspects of work 
ergonomics could be investigated with the chosen method (e.g. psychological factors), 
the use of a physical demonstrator is necessary for a complete evaluation. 


Abstract


In recent times, learning by demonstration has seen tremendous progress in robotic assem- 
bly operations. One of the most prominent trajectory-level task models applied is Dynamic 
Movement Primitives (DMP). However, it lacks the ability to tackle complex operations 
as often encountered in industrial assembly. Augmenting low-level models with a high- 
level framework in which different movement segments are deliberately parameterised 
is considered promising for such scenarios. This paper investigates the combination of 
trajectory-level DMPs with Methods-Time Measurement (MTM). We demonstrate how 
the MTM-1 system is utilised to establish distinct DMP models for five of its basic
elements, paving the way to benefitting from the sophisticated MTM system. The evalu- 
ation of the framework is conducted on a generic pick and place operation. Compared to 
a one-model-fits-all DMP approach for the whole task, the proposed method shows the 
advantage of appropriate temporal scaling, accuracy levelling and force consideration at 
adequate times. 
1 Introduction


With the shift from mass production to mass customisation [1], in combination with a
growing labour shortage attributed to unfavourable demographic change [2], the
competitiveness of tomorrow’s assembly industry is dictated by flexible and easy-to-program
automation systems. A solution is promised by the concept of learning by demonstration, 
which endows a robotic system with the ability to be programmed through intuitive demon- 
stration methods [3]. In recent years, task models based on trajectory-level approaches 
including Dynamic Movement Primitives (DMP) have prevailed in successfully reproduc- 
ing assembly-related movements based on human demonstration [4]. 

As Dynamic Movement Primitives minimise the teaching time through one-shot learning
and are capable of reproducing accurate trajectories with temporal and spatial scalability
[5], they are considered to satisfy key requirements for competitiveness in industrial
environments. However, handling complex tasks is still a major bottleneck of DMP and other
trajectory-level task models [4].

In this work, the promising concept of embedding trajectory-level models within high- 
level symbolic task representations to tackle complex tasks is further investigated [3, 6]. 
Compared to other approaches in which often unsophisticated and limited frameworks 
were considered, the proposed optimised DMP framework utilises the industry-established 
Methods-Time Measurement (MTM) system which provides a comprehensive and elabo- 
rated structure for assembly tasks. Hence, the two fundamentally proven methods of DMP- 
based learning by demonstration and assembly task analysis according to MTM are com- 
bined to create a solution to the situation outlined above. 

The remainder of the paper is organised as follows. Section 2 outlines the background and 
state-of-the-art for Dynamic Movement Primitives and Methods-Time Measurement. Our 
conceptual framework towards an industry-oriented MTM-1 based optimised DMP frame- 
work is depicted in Sect. 3. Section 4 provides an experimental validation of the framework 
on a generic pick and place operation, followed by a conclusion and discussion of future 
work in Sect. 5. 


2 Background 


This section summarises the theoretical background of Dynamic Movement Primitives and 
Methods-Time Measurement and the state-of-the-art relevant to the proposed framework. 


2.1 Dynamic Movement Primitives 


Dynamic Movement Primitives were initially introduced by Schaal et al. [7] in 2003. The 
revised formulation by Saveriano et al. [8] is considered the state-of-the-art for Cartesian 
space Dynamic Movement Primitives (CDMP) and was used for the proposed framework.
Here, the task space is divided into two transformation systems formed as second-order
dynamical systems to capture the translational (1) and rotational dimensions (2).

$$\tau \dot{v} = K^{p}\left[(p_g - p) - (p_g - p_0)\,s + f^{p}(s)\right] - D^{p} v + \chi^{p}, \qquad \tau \dot{p} = v \tag{1}$$

$$\tau \dot{\omega} = K^{q}\left[e_o(q_g, q) - e_o(q_g, q_0)\,s + f^{q}(s)\right] - D^{q} \omega + \chi^{q}, \qquad \tau \dot{q} = \tfrac{1}{2}\,[0, \omega^{\top}]^{\top} * q \tag{2}$$

The position, linear velocity, and acceleration are symbolised as $p, v, \dot{v} \in \mathbb{R}^3$. $q \in SO(3)$
represents a unit quaternion with $\omega, \dot{\omega} \in \mathbb{R}^3$ being the angular velocity and acceleration, and
$e_o(q_i, q_j)$ is defined as the orientation error between $q_i$ and $q_j$. The parameter $\tau$ facilitates
the temporal scaling and the scalar $s$ creates the time independency through the canonical
system. The constants $p_0, q_0$ and $p_g, q_g$ stand for the start and goal poses, respectively. The
positive definite matrices $K^{p,q}, D^{p,q}$ are stiffness and damping gains. The forcing terms $f^{p,q}(s)$
preserve the non-linear behaviour of a demonstrated trajectory through weighted Radial
Basis Functions (RBF) $w_n \Psi_n(s)$. The term $\chi^{p,q}$ represents any extension to the dynamical
system. For an in-depth explanation of DMPs see [5].
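
To make the role of the time constant and the canonical system concrete, the translational system (1) can be integrated numerically. The following Python sketch is illustrative only and is not the implementation used in this work. It assumes explicit Euler integration, scalar gains with critical damping, no extension term, a commonly used normalised form of the RBF forcing term, and a hypothetical width heuristic; the function name and all default values are assumptions.

```python
import math


def rollout_translational_dmp(p0, pg, tau, dt, t_end, weights=None,
                              K=100.0, alpha=-math.log(0.001)):
    """Explicit-Euler rollout of the translational system (1).

    weights: optional list of per-RBF weight vectors (one entry per dimension);
    None means a zero forcing term, i.e. a pure point-to-point motion.
    """
    dim = len(p0)
    D = 2.0 * math.sqrt(K)                 # critical damping for a scalar gain K
    weights = weights or [[0.0] * dim]     # single RBF with zero weight
    n_rbf = len(weights)
    # Centres equally spaced in time, mapped through s = exp(-alpha * t / tau).
    centres = [math.exp(-alpha * n / max(n_rbf - 1, 1)) for n in range(n_rbf)]
    width = float(n_rbf) ** 2              # hypothetical width heuristic
    p, v, s = list(p0), [0.0] * dim, 1.0
    path = [tuple(p)]
    for _ in range(int(round(t_end / dt))):
        psi = [math.exp(-width * (s - c) ** 2) for c in centres]
        norm = sum(psi) + 1e-12
        f = [s * sum(psi[n] * weights[n][d] for n in range(n_rbf)) / norm
             for d in range(dim)]
        v_dot = [(K * ((pg[d] - p[d]) - (pg[d] - p0[d]) * s + f[d]) - D * v[d]) / tau
                 for d in range(dim)]
        p = [p[d] + dt * v[d] / tau for d in range(dim)]   # tau * p_dot = v
        v = [v[d] + dt * v_dot[d] for d in range(dim)]
        s += dt * (-alpha * s / tau)                       # canonical system
        path.append(tuple(p))
    return path
```

Halving $\tau$ while halving the step size traverses the same path in half the time, which is the temporal scaling property referred to above.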

In preliminary works on DMPs by Schaal et al. [6] in 1999, a compact state-action-state 
sequence is shown to be a natural prerequisite for task imitation with movement primitives 
expressing states as aligned, in contact, near-to, and actions as move-to, grasp-object, move- 
above, etc. Such a combination of low- and high-level task representation is still promoted 
for handling compounded actions [3]. Following the assumption that most human hand 
movements can be segmented into reach, manipulation and withdraw phases, Mao et al. [9] 
reproduced a chopping task by identifying grasp/release transitions and key manipulation 
points. Aein et al. [10] developed a three-level task model architecture based on an action- 
grammar analogy. The low-level controller possessed arm movement primitives for position 
and force control and hand primitives for open, close, grasp, and ungrasp. Eiband et al. [11] 
defined four robot skills, including gripper open, gripper close, free movement, and haptic 
exploration, to establish a tree that describes geometric relationships between consecutive 
skills. Complex dual-arm household tasks were investigated by Caccavale et al. [12], result- 
ing in a low-level segmentation based on object proximity (near/far) and explicit human 
commands (open/close gripper) combined with a high-level attentional behaviour-based 
system to structure identified movement primitives. 

While reasonable symbolic frameworks have been explored, none is based on a sophisticated
industry-proven structure, which limits their suitability for realistic industrial
assembly operations.


232 V.H. Moreno et al. 


Table 1 Properties of MTM-1 basic elements after [13]

Basic element | Predominant purpose | Influencing factors for the manual assembly operation
Reach | Move the hand to a destination or general location | Distance of motion, condition of the target object, and pre/post velocity of the hand
Grasp | Secure sufficient control of one or more objects with the fingers or the hand | Properties of the object such as the size of the available contact area and its surroundings
Move | Transport an object to a destination | Transported object, and pre/post velocity of the hand
Position | Align, orient, and engage one object with another object | Insertion tolerance, joining pressure, and the object’s symmetry
Release | Relinquish control of an object by the fingers or hand | Opening the fingers or letting go


2.2 Methods-Time Measurement 


Introduced in 1948 by H. Maynard et al. [13], Methods-Time Measurement ranks among 
the most established predetermined motion time systems in today’s industrial market. The 
MTM-1 variant, designed to analyse short-cycle repetitions, is provably capable of
segmenting most manual assembly-related operations and methods. The proposed framework
is built on five of its basic elements, namely reach, grasp, move, position, and release. Their
definitions are provided in Table 1.

Besides the intended use for designing workplaces and work methods, MTM proves to 
be valuable for the field of robot science. Drumwright et al. [14] developed primitive actions 
for task-level programming of humanoid robots based on the MTM-1 basic elements. With 
the growing interest in establishing human-robot interaction, the MTM-1 framework was 
assessed for the analysis of robot incorporated workspaces [15]. Finally, recent research has 
explored how to automate the classification of handling tasks according to MTM-1 using 
machine learning techniques [16]. The latter promises to greatly simplify its applicability 
in the proposed learning by demonstration context. 


3 The MTM-based Optimised CDMP Framework 


The proposed framework for tackling complex assembly tasks embeds the trajectory-level
CDMP model within the industry-established MTM-1 system as the high-level task
representation. Compared to other approaches, it establishes the benefits of a comprehensive
and proven structure for industrial assembly tasks and considers distinctive properties from
individual subskills. In this section, customised CDMP models are designed to reflect the
differentiating properties of the five basic elements of the MTM-1 system. The MTM-based
optimised CDMP framework is summarised in Fig. 1 and explained in detail below.

REACH—The sequence of subskills commences typically with reaching towards a work- 
piece, where time efficiency and movement generalisation are essential. Since the covered 
distance primarily dictates the time efficiency during the reach subskill, the temporal scaling 
property of CDMP models becomes valuable, especially when the task was demonstrated 
under reduced speed. It is achieved by amending the time constant $\tau$, resulting in an
effortless adjustment of the robot’s end-effector velocity during reproduction. Since accuracy
is considered less important when approaching the workpiece, a low number of RBFs is
recommended. By doing so, a smoother trajectory is created, removing shaky
discrepancies, and the computational costs are reduced. Considering that human demonstrations
are often non-optimal for the robot’s kinematics, the weights $w_n$ may be further optimised
using reinforcement learning [5].

Besides the temporal scaling property of CDMP models, the spatial scaling option creates
additional advantageous characteristics for this subskill. While CDMP models can inherently
cope with deviating starting poses, the goal pose $p_g, q_g$ is also adjustable in real-time
through a goal switching mechanism as described in [5]. An object recognition method
may be applied to detect different workpieces and identify a quantifiable goal pose for the
reach CDMP model. Finally, the CDMP model of the reach subskill generalises further by
adjusting its trajectory in case obstacles appear on its path. This can be realised through a
CDMP extension for volumetric object avoidance which was explored in [17].

Fig. 1 The MTM-optimised CDMP framework (Remark: The specific parameterisation is subject to
the robot’s capabilities and the application’s requirements). Recommendations per subskill:
Reach: reduced time constant for faster reproduction; fewer RBFs for smoothing and reduced computational costs; path optimisation through reinforcement learning (RL); goal pose pattern recognition; object avoidance extension
Grasp: increased time constant for secure performance; more RBFs for increased accuracy; optionally force/visual feedback
Move: reduced time constant for faster reproduction; fewer RBFs for smoothing and reduced computational costs; path optimisation through RL including object; object avoidance extension including object
Position: increased time constant for secure performance; more RBFs for increased accuracy; consideration of wrench for improved positioning
Release: increased time constant for secure performance; more RBFs for increased accuracy; assessment of occurring wrench for damage control

GRASP—After reaching the target position close to the workpiece, the grasp subskill 
commences. In contrast to the reach subskill, a much shorter distance is to be bridged.
However, it does require a higher accuracy as a distinguishing characteristic, which dictates 
the success of the grasp operation. 

Based on this requirement, a high number of RBFs is recommended for replicating the
demonstrated grasp subskill. To reduce the risk of damaging inertia forces or
control limitations, a similar or slower reproduction speed than the demonstrated scenario
is desirable and realised by increasing the time constant $\tau$ in the CDMP model. As far as
the hardware setup permits it, additional visual or force feedback may be considered to 
improve the accuracy further. Finally, the gripper actuation may be reproduced through a 
simple DMP model under the same canonical system to guarantee correct actuation timing. 

MOVE—Once grasped and lifted sufficiently to allow free movement, the move subskill
is initiated to transport the workpiece close to its destination. As this subskill also focuses 
on a large motion in which the accuracy is considered less relevant, the same efficiency 
and generalisation ideas as in the reach element apply. Nevertheless, the properties of the 
transported workpiece have to be considered. This includes its weight, dimensions, and 
fragility. 

In accordance with the requirements, the time constant $\tau$ is adjusted appropriately but
may be reduced to improve time efficiency. A lower number of RBFs to reproduce the
demonstration trajectory allows smoothing out shaky demonstration motions and reduces
computational costs. When considering optimising the weights of the forcing terms $f^{p,q}(s)$
through reinforcement learning, as discussed for the reach subskill, the workpiece
dimensions must be included.

Regarding generalisation capabilities, the starting pose is provided by the end pose of
the preceding grasp CDMP model. End pose adjustments may be incorporated
in real-time as discussed for the reach subskill. Similar to the reinforcement learning aug- 
mentation, the workpiece dimensions must be considered when applying object avoidance 
methods. 

POSITION—The position subskill describes the most challenging aspect of an assembly 
task. It covers aligning, orienting, and engaging the grasped workpiece with its designated 
location relative to another object. Similar to the grasp subskill, accuracy is a vital factor for 
the success of this subskill. However, a fundamental characteristic during positioning is the 
occurrence of contact forces and torques which can significantly influence the appropriate 
execution. In order to improve the accuracy, a high RBF density is recommended to replicate 


Towards Learning by Demonstration for Industrial Assembly Tasks 235 


the demonstrated motion. Since accurate execution is of more importance than its speed, a
suitably high time constant $\tau$ may be selected.

Beyond the achievable positional accuracy, the consideration of contact forces and torques 
promises to enhance the robustness of the position subskill. Therefore, these should be 
incorporated in the CDMP model, which can be realised in different ways [5]. 

RELEASE—The position subskill terminates when the workpiece is successfully
aligned and oriented, and no interfering forces are recorded. Once this state is reached, 
the release subskill commences by actuating the gripper and ends after a collision-free dis- 
engagement from the workpiece. Like the preceding position subskill, a continued high 
accuracy and reduced reproduction speed characterise the release CDMP model. An assess- 
ment of noticeable forces may be used to guarantee no intervention with the workpiece 
during disengagement. 
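
The recommendations of this section can be collected in a small lookup structure. The Python sketch below is one possible encoding and not part of the original framework: the class, field, and function names are hypothetical, and the numeric values are those reported for the experiment in Sect. 4.1 (10 or 200 RBF, halved or unchanged time constant). As remarked in Fig. 1, the specific parameterisation depends on the robot and the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubskillProfile:
    tau_scale: float   # factor applied to the demonstrated time constant
    n_rbf: int         # number of radial basis functions
    use_wrench: bool   # whether measured forces/torques are considered


# Values as used in the experiment of Sect. 4.1; not universal defaults.
PROFILES = {
    "reach":    SubskillProfile(tau_scale=0.5, n_rbf=10,  use_wrench=False),
    "grasp":    SubskillProfile(tau_scale=1.0, n_rbf=200, use_wrench=False),  # feedback optional
    "move":     SubskillProfile(tau_scale=0.5, n_rbf=10,  use_wrench=False),
    "position": SubskillProfile(tau_scale=1.0, n_rbf=200, use_wrench=True),
    "release":  SubskillProfile(tau_scale=1.0, n_rbf=200, use_wrench=True),
}


def planned_durations(demo_durations):
    """Scale demonstrated subskill durations (in seconds) by tau_scale."""
    return {name: PROFILES[name].tau_scale * t
            for name, t in demo_durations.items()}
```

Keeping the profile separate from the CDMP model itself makes it straightforward to swap in other values for a different robot or task.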


4 Experimental Evaluation 


The proposed MTM-1 based optimised CDMP framework was evaluated on a generic pick 
and place experiment. Here, a toy dice (8 cm × 8 cm × 8 cm) is to be picked up from its
initial location and to be placed onto a stationary assembly jig with a 9 cm × 9 cm × 1 cm
recess (see Fig. 2). Based on a human demonstration via kinesthetic teaching, the task was
reproduced using the MTM-based optimised CDMP framework and then compared with
two one-model-fits-all CDMP models with different accuracy levels.


Fig. 2 Experimental setup (labelled components: OnRobot RG6 gripper, ATI Axia80 F/T sensor, initial position, assembly jig)
4.1 Experimental Setup


The experiment was conducted on a UR5e robot from Universal Robots with an OnRobot RG6
gripper. An ATI Axia80 F/T sensor installed in the assembly jig measured the wrench during
positioning (see Fig. 2). The free drive mode of the UR5e was used for demonstration.
End-effector Cartesian poses were recorded at 100 Hz while the gripper was actuated manually using
the teach pendant. During the transportation of the workpiece, an artificial disturbance was
introduced by shaking the end-effector for a short time. The desired transitions between the
five subskills were communicated by the human teacher by briefly pausing the movement.
For the MTM-based CDMP framework, the demonstration data was separated into the
subskills and fed to individual CDMP models as described in Sect. 3. In accordance with the
proposed framework, the reach and move CDMP models were simplified with 10 RBF and
doubled in speed by halving the time constant $\tau$. In contrast, the grasp, position, and release
CDMP models were generated with 200 RBF to improve their accuracy and the same time
constant $\tau$ as during demonstration. All other CDMP parameters were kept the same across
subskills, including $K^{p,q}$ as 100, $D^{p,q}$ being critically damped, the canonical system’s parameter
$\alpha = -\ln(0.001) \times \tau$, and RBF centres equally distributed in time with a width of 2. As
the subskill transitions occurred at zero velocity, the individual CDMP sequences were
merged using the approach suggested by Saveriano et al. [8], with the initial pose of each
subskill being the end pose of the prior CDMP subskill. The generalisation of the starting pose was
examined by introducing an offset of +3 cm in each translational dimension to the starting
position of the demonstration data. For comparison, the one-model-fits-all CDMP approach
was used twice with 10 and 200 RBF per subskill, no temporal scaling, and all other CDMP
parameters being equivalent. During reproduction, the gripper was actuated manually by the
human operator. The offline processing of the demonstration data and CDMP calculation
was conducted in MATLAB (the code is accessible at
https://github.com/VictorHerMor/2022-mtm-based-dynamic-movement-primitives-mhi).
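
Since the teacher marked subskill transitions by briefly pausing, the recorded poses can be segmented by detecting intervals of near-zero speed. The following Python sketch shows one way to do this for positions sampled at 100 Hz. It is not taken from the published MATLAB code; the function name, the speed threshold, and the minimum pause length are hypothetical and would need tuning for real data.

```python
import math


def split_at_pauses(positions, rate_hz=100.0, speed_eps=0.005, min_pause_s=0.3):
    """Return (start, end) sample index pairs of motion segments.

    positions: sequence of (x, y, z) tuples in metres, sampled at rate_hz.
    A pause is a run of at least min_pause_s with speed below speed_eps (m/s).
    The end index is exclusive.
    """
    dt = 1.0 / rate_hz
    min_pause = int(round(min_pause_s * rate_hz))
    # moving[i] is True if the end-effector moved between samples i-1 and i.
    moving = [False]
    for a, b in zip(positions, positions[1:]):
        moving.append(math.dist(a, b) / dt > speed_eps)

    segments, start, still = [], None, 0
    for i, is_moving in enumerate(moving):
        if is_moving:
            if start is None:
                start = i          # a new motion segment begins
            still = 0              # short hesitations do not end a segment
        elif start is not None:
            still += 1
            if still >= min_pause:
                segments.append((start, i - still + 1))
                start, still = None, 0
    if start is not None:          # recording ended while still in a segment
        segments.append((start, len(moving) - still))
    return segments
```

For a demonstration with the five subskills separated by pauses, the function would be expected to return five index ranges, each of which can then be passed to its own CDMP model.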


4.2 Results and Discussion 


Figure 3 shows the translational dimensions of the proposed MTM-based optimised CDMP 
framework compared to the one-model-fits-all CDMP approaches, from which four essential 
differences are observed. The durations of the reach and move subskills are indicated by the
green and blue highlighted areas within the y-dimension graph. A comparison with the
demonstration data shows that the desired end pose was reached after half the time of the
respective subskill, reducing the whole reproduction duration by approximately 10 s. The introduced
3 cm offsets in each translational dimension were eliminated during the reach subskill,
demonstrating the capability of coping with deviating starting positions (green circles).
The artificially introduced disturbance during the move subskill (around 30 s, blue circles)
was smoothed out in the MTM-based optimised CDMP framework, while the required
accuracy during the grasp, position, and release subskills was maintained. The discrepancy
of the latter feature to the one-model-fits-all CDMP approach with 10 RBF per subskill is
highlighted with red circles, where critical dips in the z-dimension appear. Furthermore,
a highly accurate one-model-fits-all CDMP approach (200 RBF per subskill) matches
the demonstration data accurately, including the artificially introduced disturbance during
the move subskill. However, its computational costs are 36 % higher than for a
one-model-fits-all CDMP alternative with only 10 RBF per subskill. In comparison, the MTM-based
optimised CDMP framework increases the computational costs by only 6 %.

Fig. 3 Translation during demonstration, one-model-fits-all CDMP approach with 10 RBF per
subskill, and MTM-based optimised CDMP framework

Figure 4 shows the measured forces in the z-direction during the position subskill. Based
on a post-assessment of the force profile, the identical end value verifies that the dice was
successfully placed on the assembly jig. Furthermore, the forces occurring
during reproduction did not exceed those during the demonstration, suggesting a
damage-free task replication.
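
The two criteria of this post-assessment (matching end value, no force above the demonstrated peak) can be expressed as a simple check. The Python sketch below is a hypothetical helper, not part of the published evaluation; the function name and the end-value tolerance in newtons are assumptions.

```python
def force_profile_ok(demo_fz, repro_fz, end_tol=0.5):
    """Compare z-force samples (N) of a reproduction against the demonstration.

    Returns True if the reproduced peak magnitude does not exceed the
    demonstrated one and both profiles settle at the same end value
    within end_tol.
    """
    peak_ok = max(abs(f) for f in repro_fz) <= max(abs(f) for f in demo_fz)
    end_ok = abs(repro_fz[-1] - demo_fz[-1]) <= end_tol
    return peak_ok and end_ok
```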

In summary, the distinction between the five MTM-1 basic elements and the design of
characteristic CDMP models bring the decisive benefit of focusing on their unique require- 
ments, paving the way to tackling compounded and complex assembly operations. 
Fig. 4 ATI Axia80 z-force measurement during the position subskill (human demonstration and
MTM-based optimised CDMP framework)


5 Conclusion and Future Work 


While Dynamic Movement Primitives are considered a promising approach for robotic 
learning by demonstration, their stand-alone application falls short in handling complex assembly
tasks. This paper has presented a method to address this limitation by distinguishing subskills 
on a symbolic level provided by the industrially well-established MTM-1 framework. By 
doing so, five unique CDMP models were defined, which are designed to match the individual 
characteristics of the MTM-1 basic elements reach, grasp, move, position, and release. 
The proposed method was evaluated on a pick and place assembly task, showing
decisive benefits over the one-model-fits-all CDMP approach. These include appropriate
time management, matching accuracy in relevant periods of the assembly task and force 
monitoring at adequate times. 

With the presented experimental results demonstrating its proof-of-concept, the
framework’s optimisation shows potential for further analysis. While the proposed approach
currently relies on the authors’ expertise to parameterise the CDMP models, a sophisticated
mathematical analysis regarding the design decisions and their implementations will provide more
robustness to the system. On the other hand, the proposed method’s full potential is yet to be
explored, including its analogy to human efficiency with the predetermined motion-time and
further abstraction through elaborated subsequent MTM variants. Finally, the transferability
and generalisation to other robot systems and applications will be explored in future work.
Visual Programming of Robot Tasks with Product
and Process Variety 


Dominik Riedelbauch and Sascha Sucker 


Abstract 


In flexible manufacturing settings, automation is shaped by ever changing conditions 
(e.g. varying part feeding locations, highly customizable products). Quick adaptation 
of robot systems is mostly achieved by visual end-user robot (re-)programming. In this 
paper, we discuss the explicit integration of anticipated product and process variety into 
visually programmed tasks. We contribute a task model which captures a user-defined 
range of task variants. To this end, parts are specified in terms of approximate locations 
and generalized parts families. Workspace exploration and combinatorial assignment 
planning enable online adaptation to unknown environments. Our experiments show that 
this adaptation capability can increase the economical efficiency of cobot use. 
1 Introduction and Related Work


In contrast to traditional mass production, manufacturing demands have shifted towards
shorter innovation cycles and small-batch production. This has raised the demand for flexible
manufacturing systems that can quickly be adapted to customized products by domain
experts in small and medium enterprises [6]. When additionally considering recent advances
in collaborative robotics towards flexible partial automation, adaptation of robot programs
to various sources of variety is needed [1, 4]: Product variety is needed to manufacture
different product instances from a product family by assembling parts with varying features
(e.g. color) to suit individual customer demands [9]. In this field, we particularly focus on
process-specific variations [7] that additionally yield process variety. Relevant robot task 
parameters that may change with process-specific variations are e.g. pickup or placement 
locations, or even the ordering of process steps [1]. 

Visual end-user robot programming is an established approach to cope with such 
variety [6]. Corresponding approaches [14, 17-19] are mostly based on skill frameworks. 
Those let users combine skills with human-readable semantics into tasks (e.g. [15]) even for 
human-robot collaboration [16, 18]. Modularity and intuitive usability support convenient 
(re-)programming and, in consequence, quick adaptation. In contrast, our contribution seeks 
to reduce recurrent programming efforts by applying the visual programming paradigm to a 
task model that intrinsically encodes a subset of feasible variations (e.g. different part types 
or locations) and adapts online (Fig. 1). We hypothesize that this would further contribute 
to the economic efficiency of intelligent robot systems. 

Corresponding task models with variety have also been addressed in literature. Among 
them, especially precedence graphs and hierarchical AND/OR Trees are frequently used 
in intelligent robot systems (e.g. [4, 13, 16]). They seek to encode all feasible assembly 
sequences [10], hence focussing on process variety. Similarly, hierarchical models empha- 
sizing product variety [7, 9, 11], approaches at the intersection of assembly and product 
family oriented goals [5], and ontologies to exchange production data under variety [8] 
have been proposed. They commonly decompose products into functional entities [11] until 
inseparable, constituent components referred to as primary generic products [9] or parts 
Fig. 1 Visual programming enables frequent end-user robot task adaptation to customer demands in
flexible manufacturing (a). We seek to reduce programming efforts by online adaptation (b). To this
end, we propose to explicitly encode different situations with product variety (A1, A2) or process
variety (B1, B2) in a single task model
families [7] are reached. A group of feasible variants for assigning a part in concrete product
instances is associated with each component. Analogously, groups of feasible locations can 
be expressed with spatial relations [14], or more specifically with areas in the workspace 
[18]. Taking inspiration from this group notion for feasible part types and locations, we pro- 
pose end-user programming of assembly task models with skills accepting parts families and 
partly known locations as input. This way, parameters can be partially left underspecified 
at modelling time to create a single task model for several instances of the task. Consider 
e.g. a pick-and-place task that involves fetching five bolts from the imprecise location 
conveyor and putting them into a box—with our approach, a single task model is suffi- 
cient to robustly conduct this kitting task for any positions and orientations of bolts on the 
conveyor, and for any size of bolts. 

Once a skill is executed, one of the physically present entities with precisely known 
parameters as sensed by the robot must be assigned to the symbolic part description in 
the task model. Establishing a link between symbolic parts and the world is referred to as 
the anchoring problem [3]. This in particular includes deciding between multiple sensed 
entities that equally match an ambiguous part description (e.g. bolts of different sizes all 
being of type bolt). Related approaches perform anchoring with local decisions [4, 14, 18]. 
Ambiguity is here resolved in the scope of a skill without considering subsequent process 
steps, e.g. by choosing from all matching entities the one closest to the robot [14], or by 
drawing randomly [18]. However, such decisions can render the overall process infeasible 
(Fig. 2): Despite being suitable for the currently considered skill, an entity may be strictly 
required by some subsequent skill with more strongly constrained input parts. Choosing 
the “wrong” entity will thus lead to an error when trying to anchor this subsequent skill. 
Therefore, we propose an algorithmic procedure with global decisions which considers the 
constraints of all skills during the anchoring process. 
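
The difference between local and global anchoring decisions can be illustrated on a small example. The Python sketch below contrasts a greedy, skill-by-skill strategy with a bipartite matching over all skills (augmenting paths, Kuhn's algorithm). This is a generic matching technique chosen for illustration and is not claimed to be the assignment planner of this paper; the entity labels, part types, and function names are hypothetical.

```python
def anchor_locally(skills, entities, accepts):
    """Greedy anchoring: each skill takes the first free matching entity."""
    free, result = list(entities), {}
    for skill in skills:
        pick = next((e for e in free if entities[e] in accepts[skill]), None)
        if pick is None:
            return None            # a later skill finds no matching entity
        free.remove(pick)
        result[skill] = pick
    return result


def anchor_globally(skills, entities, accepts):
    """Anchoring as bipartite matching that respects all skills' constraints."""
    owner = {}                     # entity id -> skill currently holding it

    def try_assign(skill, visited):
        for e, part_type in entities.items():
            if part_type in accepts[skill] and e not in visited:
                visited.add(e)
                # Take a free entity, or re-seat its current holder elsewhere.
                if e not in owner or try_assign(owner[e], visited):
                    owner[e] = skill
                    return True
        return False

    for skill in skills:
        if not try_assign(skill, set()):
            return None            # no feasible assignment exists
    return {skill: e for e, skill in owner.items()}


# Hypothetical situation: skills 1 and 2 accept any gear, 3 and 4 are strict.
GEARS = {"red_gear", "green_gear", "blue_gear"}
accepts = {1: GEARS, 2: GEARS, 3: {"red_gear"}, 4: {"green_gear"}}
# Entities listed in the order a local strategy would consider them.
entities = {"c": "red_gear", "a": "blue_gear", "d": "green_gear", "b": "blue_gear"}
skills = [1, 2, 3, 4]
```

In this example the greedy strategy hands the only red gear to skill 1 and then fails at skill 3, whereas the matching reassigns earlier choices until every skill is anchored.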

All in all, our contribution is twofold: (i) We propose a task model and visual pro- 
gramming procedure with robot skills accepting parts families and flexible locations rather 
than definitely specified, uniquely identified parts as input parameters. (ii) We show a 
Fig. 2 Our task models may be underspecified, e.g. by skills accepting any kind of gear (1 and
2) for adaptation to sensed parts in a world model (a-d). Locally correct anchoring decisions, e.g.
assigning red_gear c to skill 1, can render the process infeasible when subsequent skills have
strictly specified input parts (3 and 4)
computationally efficient method for anchoring and executing such task models in unknown
environments with ambiguous parts. 


2 Our Approach 


An overview of our approach is shown by Fig. 3. Users will first use a visual programming 
task editor to create a precedence graph model (Sect. 2.2) capturing different instances of 
the task (Sect. 2.3). After that, the robot workspace is prepared by supplying concrete parts. 
The task model provides partly underspecified information about the types and approximate 
locations of parts to be expected when executing the task (Sect. 2.1). From this information, 
a path to explore points of interest in the workspace with a camera attached to the robot 
hand is calculated. A world model is then built by active vision, i.e. by approaching each 
point of interest and performing object recognition. The world model enables the computa- 
tional process of plan instantiation for the perceived situation in the workspace (Sect. 2.4): 
Detected entities in the world model are assigned to parts referenced in the task model with 
an assignment planner solving the anchoring problem. Together with the task model, the 
resulting assignment solution is passed to a task sequencer. The sequencer applies a schedul- 
ing algorithm to the task model and finishes skill parametrization by replacing underspecified 
parameters with precise information from the world model. The resulting operation sequence 
is finally passed to a skill execution engine. After task completion, further materials can be 
supplied, and the plan instantiation process can be re-iterated starting from the workspace 
exploration step without manually adapting the task model. 


2.1 Part Types and Locations 


We describe parts in terms of their type and location in the workspace. To this end, a part 
type is an entry taken from a tree-shaped part type ontology. This ontology is a required input 
to the approach. It captures “is-a”-relations between a set of nodes $O = \{o_1, o_2, \ldots, o_{|O|}\}$.
Leaf nodes $P \subset O$ denominate concrete part types as which parts in the physical world
can be classified. We assume a CAD model given for each $o \in O$ for the purpose of
Fig. 3 Our approach adapts generalized task models emerging from a visual programming procedure
by means of active workspace exploration, assignment planning, task sequencing, and skill execution
grasp and placement planning. When ascending from leaf nodes upwards towards the root
node, encountered inner ontology nodes encode increasingly generic part descriptions. The 
ontology thus encodes parts families with an increasing level of generalization over part 
types. An example inspired by the benchmark domains used in our experiments is shown 
by Fig.4. Here, different gear and conductor leaf part types are summarized under the 
more general terms gear and conductor. The approach is intuitively adapted to other 
domains by specifying a corresponding tree with several levels of generalized part types. 
Formally, the ontology is characterized by the function is_a: O x O — {TRUE, FALSE} 
with is_a(o, 0’) = TRUE whenever o = o’ or o is a child of o’. In all other cases, is_a(o, 0’) 
is FALSE. 
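
The is_a function can be illustrated with a minimal Python sketch. The ontology below (the names `part`, `gear`, `red_gear`, etc. and the `PARENT` dictionary) is hypothetical and only loosely follows the gear/conductor example. The sketch also assumes that "child" is meant transitively, so that a leaf type also counts as an instance of every more general node above it.

```python
# Hypothetical ontology stored as child -> parent (None marks the root node).
PARENT = {
    "part": None,
    "gear": "part",
    "conductor": "part",
    "red_gear": "gear",
    "blue_gear": "gear",
    "red_conductor": "conductor",
}


def is_a(o, o_prime):
    """Return True if o equals o_prime or o_prime lies on the path from o to the root."""
    node = o
    while node is not None:
        if node == o_prime:
            return True
        node = PARENT[node]  # ascend one level towards the root
    return False
```

With this tree, `is_a("red_gear", "gear")` holds, whereas `is_a("gear", "red_gear")` does not, since generalization only works upwards.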

Fig. 4 A part ontology tree encodes “is-a”-relations to group different part leaf types into more 
generic type descriptions represented by inner tree nodes 

Regarding the part location, we distinguish two cases: A location can be known precisely 
and, hence, be specified by a rigid body transform ${}^{w}T_{part} \in \mathbb{R}^{4 \times 4}$ indicating the object 
translation and rotation with respect to some world frame $w$. This is e.g. the case for object 
recognition results, for parts provided on workpiece carriers etc. In the second case, a part 
location is not given precisely, but only within a certain tolerance. These two concepts can 
be captured by a unified formalization: Let $L = \{l_1, l_2, \ldots, l_{|L|}\}$ denote a set of locations 
relevant to the task. A location $l_i \in L$ may describe the precise position and orientation of 
some place where parts are usually located (e.g. the output slot of a parts feeder). Let $L^{pr} \subset L$ 
denote these precisely known locations, each associated with a rigid body transform 
pose($l_i$) $\in \mathbb{R}^{4 \times 4}$ ($l_i \in L^{pr}$). In addition to these precisely known locations, elements 
of $L$ may also describe a 2-dimensional area on the workbench surface, a 3-dimensional 
volume defining the interior of a box etc. We will see in Sect. 2.3 how $L$ emerges from the 
visual programming process. For the planning process (Sect. 2.4), each location $l_i \in L$ is 
associated with a location function $\mathrm{is\_at}_{l_i} : O \times \mathbb{R}^{4 \times 4} \to \{\mathrm{TRUE}, \mathrm{FALSE}\}$. These functions 
are designed to output $\mathrm{is\_at}_{l_i}(o, {}^{w}T_{part}) = \mathrm{TRUE}$ for a part type $o \in O$ and transformation 
${}^{w}T_{part} \in \mathbb{R}^{4 \times 4}$ if and only if some part of type $o$ with pose described by ${}^{w}T_{part}$ is at the location 
denominated $l_i$. Our system currently supports $\mathrm{is\_at}_{l_i}$ functions for comparing equality of 
precise positions, and for checking whether parts lie in planar workspace areas considering 
their axis-aligned bounding boxes aabb($o$) (Fig. 5). The formalism allows for integrating 
more complex location specifications in future work (e.g. spatial relations between parts). 
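
The two supported kinds of location functions might be sketched in Python as follows. Everything here is an illustrative assumption rather than the system's actual implementation: the factory names `make_is_at_area` and `make_is_at_pose`, the rectangular area aligned with the workbench axes, the bounding-box half extents per part type, and the position tolerance. Transforms are plain nested lists for the sake of a self-contained example.

```python
def pose_at(x, y, z):
    """Build a 4x4 homogeneous transform (nested lists) with identity rotation."""
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


def translation(T):
    """Extract the (x, y, z) translation from a 4x4 homogeneous transform."""
    return (T[0][3], T[1][3], T[2][3])


def make_is_at_area(x_min, y_min, x_max, y_max, aabb_half_extents):
    """Location function for a rectangular workbench area (assumed axis-aligned).

    aabb_half_extents maps a part type to the (half_x, half_y) size of its
    axis-aligned bounding box; the part must lie completely inside the area.
    """
    def is_at(part_type, T_part):
        x, y, _ = translation(T_part)
        hx, hy = aabb_half_extents[part_type]
        return (x_min <= x - hx and x + hx <= x_max
                and y_min <= y - hy and y + hy <= y_max)
    return is_at


def make_is_at_pose(T_ref, tol=1e-3):
    """Location function for a precise location, compared up to a tolerance (assumed)."""
    def is_at(part_type, T_part):
        return all(abs(a - b) <= tol
                   for a, b in zip(translation(T_ref), translation(T_part)))
    return is_at
```

Both factories return functions with the signature described above (part type and transform in, truth value out), so the planner can treat precise and area-based locations uniformly.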


2.2 Task Models with Degrees of Freedom 


Our goal is programming tasks that can be adapted to product and process variety at execution 
time. To this end, we first define the notion of part templates which capture boundary 
conditions that parts used in a task must satisfy. A part template $p = (p^{type}, p^{loc})$ combines 
an arbitrary node $p^{type} \in O$ from the part type ontology with a location $p^{loc} \in L$. It describes 
a part with parameters that are possibly only partly known during the visual programming 
procedure, e.g. a conductor that may be either red, green, or blue and that lies at any position 
within a larger area on the workbench. Part templates enable task models with a certain 
degree of generality regarding part types and locations: In our framework, each task $(T, <_T)$ 
is composed of partially ordered operations $T = \{\tau_1, \tau_2, \ldots, \tau_{|T|}\}$. The partial order $<_T$ 
defines assembly precedence relations between operations, i.e. some operation $\tau_i \in T$ must 
be done before $\tau_j \in T$ ($i \neq j$) if and only if $\tau_i <_T \tau_j$. This task model is well known 
from the assembly planning domain [10] and suited for flexible production settings. We 
further describe each operation with a pair $\tau_i = (p_i, l_i)$ of a part template $p_i$ and a part goal 
location $l_i \in L$. The model thus covers any sort of operation where a part is transferred to a 
new location by the robot. This comprises basic pick-and-place actions as well as operations 
during which the transfer requires more sophisticated robot control (e.g. force-supervised 
gear meshing, see Sect. 3). 

Fig. 5 Our task editor (left) combines icon-based precedence graph modelling (a) with part creation 
in a virtual workspace (b). The modelling process outputs task models with associated operators to 
compare locations and part types (right) 

Task models as defined above are underspecified, and each part template must be anchored 
to a physical entity when the task is executed (Sect. 1). To this end, the robot builds a world 
model $W = \{\hat{p}_1, \ldots, \hat{p}_{|W|}\}$ containing all entities perceived on camera images. Entities are 
encoded by part states. In contrast to part templates, part states $\hat{p} = (\hat{p}^{type}, \hat{p}^{loc})$ combine 
an ontology leaf node $\hat{p}^{type} \in P$ and a precise location $\hat{p}^{loc} \in L^{pr}$ as detected by object 
recognition. We say that an operation $\tau_i \in T$ may be applied to a part state $\hat{p} \in W$ if and 
only if $\hat{p}$ satisfies the part template $p_i$. Validation of this connection between part templates 
and states is achieved with a satisfies-function (Eq. 1). 


$$\mathrm{satisfies}(p, \hat{p}) = \begin{cases} \mathrm{TRUE} & \text{if } \mathrm{is\_a}(\hat{p}^{type}, p^{type}) \wedge \mathrm{is\_at}_{p^{loc}}(\hat{p}^{type}, \mathrm{pose}(\hat{p}^{loc})) \\ \mathrm{FALSE} & \text{otherwise} \end{cases} \tag{1}$$


2.3 Visual Programming 


Users create task models by interacting with a graphical editor shown in Fig. 5. To this end, 
it is first necessary to specify part templates for each part to be used during the task. A 
new template can be added by choosing its part type and initial part location. The user is in 
charge of selecting from the part type ontology appropriately so that the desired level of task 
generalization is reached. The selection of locations is supported by a virtual representation 
of the workspace. In the virtual workspace, a workspace layout as introduced in our prior 
work [16] offers pre-defined regions to be chosen as part locations (e.g. $l_4$ in Fig. 5, left). 
For each area defined by the layout, a location function based on the area corner vertices is 
instantiated and added to the location set $L$ (Sect. 2.1). If the user prefers to specify part poses 
precisely ($l_1$, $l_2$, $l_3$ in Fig. 5), additional location functions are defined by corresponding 
precise poses. Having specified all parts, pick-and-place operations may be added. Finally, the 
operations are connected with precedence relations using the icon-based editor component. 
Currently, the system is based on a single pick-and-place skill — suitable control algorithms 
are derived from annotations to the part type ontology (e.g. force-supervised gear meshing 
vs. position-controlled placement of our benchmark conductor parts). Yet further classes 
of skills, e.g. for visual inspection or presentation of parts to the user for collaborative steps, 
can be added in the future. 


2.4 Plan Instantiation 


Having modelled a task with operations $T = \{\tau_1, \ldots, \tau_{|T|}\}$, users need to prepare the 
workspace by supplying necessary parts to the robot. After an active vision exploration 
procedure (see [2] for an overview of applicable methods), the robot has all detected parts 
stored in its world model $W = \{\hat{p}_1, \ldots, \hat{p}_{|W|}\}$. The next step is solving the anchoring 
problem as introduced in Sect. 1, i.e. mating each part template $p_i$ of operation $\tau_i$ with a part 
state $\hat{p}_j$ so that satisfies($p_i$, $\hat{p}_j$) holds. Assuming that the user has provided at least one part 
for each operation ($|W| \geq |T|$), this means $O(|W|!)$ possible assignments. Enumerating and 
testing those to find a valid solution is clearly a computationally infeasible combinatorial 
problem even for small $|W|$. However, we can apply efficient combinatorial optimization 
algorithms to this unbalanced assignment problem, e.g. the well-known Kuhn-Munkres 
algorithm [12] with $O(|W|^3)$ runtime complexity: 

Let $C = (c_{i,j})$ denote a $|T| \times |W|$ cost matrix with a row for each part template and a 
column for each part state. Any wrong assignment of $\hat{p}_j$ to $p_i$ is modelled to have infinite 
costs, whereas a correct assignment has no costs, i.e. 


$$c_{i,j} = \begin{cases} 0 & \text{if } \mathrm{satisfies}(p_i, \hat{p}_j) \\ \infty & \text{otherwise} \end{cases} \qquad i \in \{1, \ldots, |T|\},\; j \in \{1, \ldots, |W|\} \tag{2}$$


Given $C$, combinatorial optimization computes an optimal, injective assignment 
$f : \{1, \ldots, |T|\} \to \{1, \ldots, |W|\}$ which minimizes the total assignment costs $\sum_i c_{i,f(i)}$ 
($i \in \{1, \ldots, |T|\}$). In our case, $f$ says that part template $p_i$ of operation $\tau_i$ must be 
associated with part state $\hat{p}_{f(i)}$ to incur the minimum cost assignment. By construction of $C$, any 
solution involving a wrong assignment (cf. Fig. 2) leads to infinite overall costs. This means 
in practice that the user has not supplied all required parts to the workspace—in this case, 
our system outputs an error message to inform about missing parts. By contrast, a solution 
f with 0 overall costs means that each part template was matched with a suitable entity in 
the workspace. The process can then proceed to the task sequencing step. 
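
Because the cost matrix only contains the values 0 and ∞, a solution with zero overall cost exists exactly when every part template can be matched to a distinct compatible part state. The following Python sketch illustrates this feasibility check with a simple augmenting-path bipartite matching. This is a deliberately simpler stand-in for the Kuhn-Munkres algorithm named above, not the implementation used in the system. The function name `assign` and the boolean matrix `compatible` (True where the cost is 0) are assumptions for illustration.

```python
def assign(compatible):
    """Match each template (row) to a distinct part state (column).

    compatible[i][j] is True when part state j satisfies template i (cost 0).
    Returns a list f with f[i] = assigned column, or None if at least one
    template cannot be matched (the infinite-cost case).
    """
    n_rows = len(compatible)
    n_cols = len(compatible[0]) if compatible else 0
    owner = [None] * n_cols  # owner[j] = template currently holding part state j

    def try_assign(i, seen):
        for j in range(n_cols):
            if compatible[i][j] and j not in seen:
                seen.add(j)
                # Take j if it is free, or if its current owner can move elsewhere.
                if owner[j] is None or try_assign(owner[j], seen):
                    owner[j] = i
                    return True
        return False

    for i in range(n_rows):
        if not try_assign(i, set()):
            return None  # some required part is missing in the workspace

    f = [None] * n_rows
    for j, i in enumerate(owner):
        if i is not None:
            f[i] = j
    return f
```

A return value of `None` corresponds to the situation in which the system reports missing parts. For general (non-binary) costs, a full assignment solver would be needed instead.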

The task sequencing procedure prepares a fully specified sequence of operations to be 
executed by the skill engine. For each operation $\tau = (p, l)$, a suitable input entity matching 
$p$ is known from the above assignment $f$. We further use a grid-based placement planner 
that determines precise part goal locations whenever the operation goal location $l$ is an 
area. Finally, the precedence graph is transferred into a sequence that complies with all 
“earlier-later” relations. The fact that we are using a graph structure as task model opens 
a range of future possibilities here: Aside from searching for an operation sequence that 
optimizes energy consumption or other secondary criteria, planning of collaborative action 
with a human-robot scheduler would also be feasible at this point in the process. 
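
Turning the precedence graph into one admissible operation sequence amounts to a topological sort. A minimal sketch using Python's standard `graphlib` module is given below. The operation names and the helper `sequence` are hypothetical, and the sketch returns an arbitrary valid order without optimizing any secondary criterion.

```python
from graphlib import TopologicalSorter  # available from Python 3.9


def sequence(operations, precedes):
    """Return one operation order respecting all pairs (a, b), meaning a before b."""
    predecessors = {op: set() for op in operations}
    for earlier, later in precedes:
        predecessors[later].add(earlier)
    # static_order() raises graphlib.CycleError if the relations are contradictory.
    return list(TopologicalSorter(predecessors).static_order())


# Hypothetical task: t1 and t2 must both be finished before t3.
order = sequence(["t1", "t2", "t3"], [("t1", "t3"), ("t2", "t3")])
```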


3 Experimental Validation 


We have modelled four benchmark tasks (Sect. 2.3) which are designed to illustrate specific 
aspects of product and process variety (Fig. 6a): Product variety is represented by task S1, 
in which gears of arbitrary types (red, blue, green, cf. Fig. 4) are assembled with force- 
supervised robot control. Task S2 is a kitting task, where a connector of each type is added 
to a bundle of three. Tasks S3 and S4 replicate assembly tasks of electrical circuits with a 
serial/parallel connection. The tasks S2-S4 use region-based initial locations, thus enabling 
convenient part feeding by the user. Task S2 furthermore allows for the bundle to be placed 
anywhere within an area. We have executed each task with different workspace configurations 
(e.g. S1 with different part types, S4 with orderly or arbitrarily placed connectors, 
cf. Fig. 1). Online adaptation and task execution in these differing settings was achieved 
successfully. 

Moreover, a theoretical comparison of the effort needed for adaptation with our approach 
versus the traditional re-programming method was conducted. We say that a production 
cycle consists of executing a task $N$ times, i.e. finishing $N$ instances of a product. By 
introducing the flexibility demand ratio $FD = 1/N$ we characterize the manufacturing setting, 
i.e. traditional mass production with hardly any adaptations for $FD \to 0$, decreasing lot 
sizes for $FD \to 1$, and one-off products for $FD = 1$. The adaptation effort per cycle of our 
approach depends on N, as each program execution is preceded by exploration and assign- 
ment planning—re-programming effort is not required during a cycle as the task models 
for S1-S4 have covered all necessary adjustments. During the experiments with our bench- 
mark tasks, an exploration time of about 9s was measured whereas the planning time was 
negligible. By our definition, the effort per cycle for adaptation by visual re-programming 
is independent of N and therefore constant. However, the re-programming time including 
loading and saving the task model depends on the degree of necessary changes. We have 
considered three cases where only one operation or corresponding part (minimum effort); 
half of the involved parts (medium effort); or all parts (maximum effort) need to be adjusted 
in the task model between consecutive cycles. Representative durations of these three 
re-programming types have been gathered by observing an expert operate our task editor (min. 
≈ 31 s; med. ≈ 80 s; max. ≈ 110 s). 

Fig. 6 Our experiments comprise different benchmark tasks S1-S4 (a, goal states are rendered 
transparently). Adaptation time measurements enable a comparison of our approach and manual 
re-programming for different lot sizes (b) 

Figure 6b compares the time allocated for adaptation within a production cycle depending 
on FD. In general, our approach achieves better results than manual re-programming in 
highly flexible domains, i.e. for higher FD values, since task variants are largely encoded in 
the task model. In particular, it performs better for lot sizes of three or less, even when 
considering re-programming with minimum effort. In other words, less adaptation effort is needed 
with our approach compared to manual re-programming for finishing three products, which 
confirms our hypothesis regarding economical efficiency (Sect. 1). For medium and maximum 
re-programming effort, this amortization threshold shifts towards larger lot sizes. 
However, the effort for exploring the workspace before each task iteration renders 
re-programming more efficient in mass production settings with relatively few changes. These 
quantitative results must of course be interpreted within the limits of our benchmark tasks. 
Yet, our analysis illustrates qualitative relationships that are transferable to other scenarios 
and applications. 
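
The amortization threshold can be reproduced with a small calculation. The sketch below assumes the simple model described in the text: our approach costs one exploration (about 9 s, planning time neglected) per task execution, while manual re-programming costs a constant amount once per cycle. The function names are chosen for illustration only.

```python
EXPLORATION_S = 9.0  # measured exploration time per task execution
REPROGRAMMING_S = {"min": 31.0, "med": 80.0, "max": 110.0}  # expert timings


def adaptation_time_ours(n):
    """Adaptation time in a cycle of n executions, with exploration before each one."""
    return n * EXPLORATION_S


def largest_favourable_lot_size(reprogramming_s):
    """Largest lot size n for which exploration is cheaper than re-programming once."""
    n = 0
    while adaptation_time_ours(n + 1) < reprogramming_s:
        n += 1
    return n
```

Under these assumptions, the threshold for minimum re-programming effort is three executions (27 s versus 31 s), which agrees with the lot size stated above. For medium and maximum effort the same model yields eight and twelve executions, respectively.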


4 Conclusion and Future Work 


In this paper, we have contributed a visual programming and robot task execution approach 
that incorporates product and process variety. For this, part templates are specified as input 
to robot skills in terms of approximate locations and generalized parts families. This leads to 
partly ambiguous, underspecified task models capturing a set of task variants. Adaptation to 
concrete parts is achieved online by workspace exploration and combinatorial optimization 
to anchor ambiguous part templates to perceived concrete parts. Our experiments with a set 
of characteristic benchmarks show how this approach helps to reduce the (re-)programming 
effort of robots in flexible manufacturing settings. 

We will address several limitations of the approach in future work: Currently, the task 
structure and number of processed parts are fixed. Further task variety could be achieved 
by augmenting the task model with constructs such as loops for situation-dependent repetition 
of operations. Furthermore, we will extend the approach towards human-robot co-working 
by integrating multi-agent scheduling. Finally, our concept needs a comparison with other 
visual programming systems to evaluate the impact of generic part and location descriptions 
on usability. 


and Workpiece Registration 


Edgar Schmidt, Pascal Ruppert and Dominik Henrich 


Abstract 


The ability to review robot programs before they are executed can be used to correct 
erroneous programming. In complex processes, such a review can only be achieved by 
integration of additional peripheral devices and workpieces used into the programming 
framework. In this work, we present a semi-automatic method for the calibration of a 
turntable and a workpiece registration based on the turntable calibration utilizing only a 
DOE laser. The turntable pose is calculated by approaching markers on the turntable on 
a so-called acquisition plane. Based on the calibration, workpieces are registered with 
intersection points of the laser beams with the turntable rotary plate. We evaluated 
our approach in terms of accuracy and the amount of time for execution. The resulting 
poses of the turntable and workpieces are used to load models into the 3D simulation of 
the programming framework and thus can be used to review the robot program. 


1 Introduction 


In human-robot collaboration, humans and robots work together on tasks, or a human solves 
tasks by operating the robot, so that the advantages of both parties are combined. For example, 
in fibre spraying processes a spray gun is attached to a robot and the robot is programmed 
through direct guidance, so that the process knowledge of the operator is combined with 
the robot’s precision and repeatability [1]. This approach, called playback programming or 
kinesthetic programming [2] is intuitive, since no robotics knowledge is required. 

A useful extension to this programming approach is the editing of the robot program 
after recording the robot motion and before execution [3]. For instance, it is possible to cut 
out errors in the robot trajectory or to insert repetitions or branches. The editing concept 
can be extended to simulate the robot path with the goal to review and especially avoid 
incorrect robot programs. For this, the setup with all relevant objects must be known in the 
programming framework. 

In this work, we present an extension to a kinesthetic programming framework [4], such 
that it is possible to calibrate an external turntable in relation to the robot and add it into 
the 3D simulation of the programming framework. Based on the calibration, the user can 
register workpieces in the 3D simulation of our framework. With this, it is possible to review 
robot trajectories in the 3D simulation of the framework as shown in Fig. 1. 

Section 2 gives an overview of sensors and algorithms that can be used to calibrate 
peripherals and register objects. In Sect. 3, we describe our approach for the calibration of 
a turntable. Section 4 describes the method for registration of workpieces. Our extensions 
are evaluated in terms of accuracy and execution time in Sect. 5. Section 6 summarizes and 
concludes the paper. 


Fig. 1 Given a setup with a turntable and workpiece (left), our goal is to know this setup in the 3D 
simulation of our programming framework (right) 


2 Related Work 


If a robot has to work with or on objects, it is necessary to localize these objects and to create 
a virtual representation. For this, different sensors can be used. Multiple colour and depth 
cameras are commonly used to monitor a work cell [5]. Colour or colour depth cameras 
attached to a robotic arm are used for pose estimation and motion planning [6], and 
laser-assisted welding systems already utilize a digital industrial camera combined with a laser 
[7]. Nevertheless, camera-based methods need to handle changing lighting conditions [8] 
and motion blur [9]. Ultrasonic sensors are used to detect empty paint cups [10] and in 
outdoor applications, such as autonomous driving, radar and lidar are commonly used in 
case of lighting extremes such as day and night [11]. In our application, fibre spraying, only 
minor lighting changes occur and the process produces air pollution [12]. This results in 
an additional effort for cleaning and calibrating sensors. Because of this, the used sensor should 
be as simple as possible. 

Most of the challenges of the current problems of object localization are not relevant to 
our application. As mentioned, highly changing lighting conditions are not to be considered. 
Also, an approach which can handle a variety of non-stationary objects in a large scene is 
not required [13]. Additionally, hard real-time requirements are not given and so a fast 
computation is not needed [14]. Consequently, we decided to use an approach which is not 
facing these problems. 

Therefore, a semi-automatic process in which a user can calibrate a turntable and register 
workpieces using a low-cost DOE laser is sufficient. Thus, there is no need for a complex 
object localization or additional sensors. Further, a DOE laser can be used as a visual aid 
for playback programming in general. The only drawback of such a method is the accuracy 
of the procedure as this depends on the manually performed steps. 


3 Turntable Calibration 


We present a semi-automatic procedure with which a user can determine the complete pose 
of the turntable (TT) in reference to the robot, so that the TT can be included into the virtual 
simulation. After defining general terminology and functions for the calibration, we explain 
the calibration procedure in detail. In our case, the tilt of the TT is set manually before 
calibration, which results in the need for an explicit calculation of the tilt angle. If the tilt 
could be controlled and set by software, the presented method would be simpler. 


3.1 Definitions 


We define the origin of the world coordinate system in the base of the robot and as a 
right-handed system. A pose $P = (T, O)$ in our world consists of a position $T := (x, y, z)$ with 
$x, y, z \in \mathbb{R}$ and an orientation $O \in \mathbb{R}^{3 \times 3}$. The set of all poses is called $\mathcal{P}$. The extension 
.x, .y and .z is used to reference the x, y or z coordinate of a pose. Further, the plane 
$\mathcal{A} := \{(x, y, z_a) \mid x, y \in \mathbb{R} \wedge z_a = \mathrm{const} \in \mathbb{R}\}$ is called acquisition plane. The smallest angle 
between $\mathcal{A}$ and the laser beam mounted on the robot is called the laser angle $\lambda \in\, ]0^\circ, 90^\circ]$. 
The limitation of $\lambda$ is given by the structure of the used robot cell. 

We define a set of markers $\mathcal{M} := \{m_0, m_1, m_2, m_3, m_4\}$ with positions $m_i$ on the rotary 
plate of the TT. The marker $m_4$ is located in the centre of the rotary plate. The remaining 
markers are set on a circle with a known diameter $D$ around the marker $m_4$ and are indexed 
clockwise, so that ellipse semi-axes are defined by $[m_0, m_2]$ and $[m_1, m_3]$, as shown in Fig. 2 
left. Further, a set of robot poses $\mathcal{Y} := \{p_0, p_1, p_2, p_3, p_4\}$ with $p_i \in \mathcal{P}$ is given, which 
are obtained by pointing the laser beam at the markers $m_i$ on a fixed $\mathcal{A}$ with a fixed laser 
angle $\lambda$. 

A turntable configuration $TTC := (\varphi, T_T, O_s)$ consists of a tilt angle $\varphi \in [0^\circ, 90^\circ]$, 
a turntable position $T_T$, in which the mounting height $z_T$ is known from the structure of 
the robot cell, and a turntable orientation $O_s \in \{\mathrm{left}, \mathrm{down}, \mathrm{right}, \mathrm{up}\}$. The orientation 
$O_s$ is simplified, since the TT can only be tilted to the left, down, right and up, which 
corresponds to rotations of 0°, 90°, 180° and 270° around the z-axis of the TT, as shown in 
Fig. 2 right. The intrinsic parameters of the TT are known. Thus, we can use the function 
$T : \varphi \times O_s \to T_T$, which returns the translation between the centre of the rotary plate and 
the TT position based on the tilt angle $\varphi$ and the orientation $O_s$. 


3.2 Calibration Method 


Before each calibration, a reference run of the TT is performed, which guarantees correct 
marker positions on the turntable. Then, the user moves the robot via hand guidance, 
so that the DOE laser beam points at the markers $\mathcal{M}$ on the rotary plate. The rotational 
axes as well as the translation in z are locked while guiding the robot, defining $\mathcal{A}$. The 
number of markers on the turntable is minimal, as $m_4$ serves as redundancy and can be added 
optionally. Marker $m_0$ is the highest point of the circle on which the markers are defined and 
the farthest away from the tilt axis. During the calibration, the markers are approached in 
ascending order of their indexing (clockwise) and the corresponding robot poses $\mathcal{Y}$ on $\mathcal{A}$ are 
stored. Based on the recorded $\mathcal{Y}$, the turntable configuration TTC can be calculated as follows. 

Fig. 2 The DOE laser beam is used to point at markers from robot positions on the acquisition plane 
(left). Different mounting orientations of the turntable have to be considered (right) 


Orientation: Owing to the structure of the possible orientations, either the line segment $[p_0, p_2]$ or 
$[p_1, p_3]$ is perpendicular to the world x-axis and can be used to determine the orientation. For 
this, we first compare the y-values of the $p_i$ to choose which line segment is perpendicular to 
the x-axis and then consider the values of the y-coordinate (down and up) or the x-coordinate 
(left and right): 


$$O_s(\mathcal{Y}) := \begin{cases} \mathrm{down}, & \text{if } |p_0.y - p_2.y| < \varepsilon \wedge p_1.y < p_3.y \\ \mathrm{up}, & \text{if } |p_0.y - p_2.y| < \varepsilon \wedge p_1.y > p_3.y \\ \mathrm{left}, & \text{if } |p_1.y - p_3.y| < \varepsilon \wedge p_1.x < p_3.x \\ \mathrm{right}, & \text{if } |p_1.y - p_3.y| < \varepsilon \wedge p_1.x > p_3.x \end{cases} \tag{1}$$


We allow inaccuracies in approaching the markers with the threshold $\varepsilon \ll 1$. The equation 
also holds if $p_0$ and $p_2$ are the same point, which occurs if the tilt and the laser angle are 
both 90°. 
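
The case distinction for the orientation translates directly into code. The following Python sketch assumes that each recorded pose is reduced to its (x, y) position on the acquisition plane. The function name `classify_orientation` and the default threshold value are illustrative assumptions. Inputs that match none of the four cases raise an error, since the equation leaves them undefined.

```python
def classify_orientation(p0, p1, p2, p3, eps=1e-2):
    """Classify the turntable orientation from recorded (x, y) robot positions.

    eps is the tolerance for inaccurately approached markers (assumed value).
    """
    dy02 = abs(p0[1] - p2[1])
    dy13 = abs(p1[1] - p3[1])
    if dy02 < eps and p1[1] < p3[1]:
        return "down"
    if dy02 < eps and p1[1] > p3[1]:
        return "up"
    if dy13 < eps and p1[0] < p3[0]:
        return "left"
    if dy13 < eps and p1[0] > p3[0]:
        return "right"
    raise ValueError("recorded poses do not match any supported orientation")
```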


Tilt angle: The tilt angle $\varphi$ is determined with $p_0$ and $p_2$. Different cases are considered 
according to the given orientation $O_s$. In case of up and down, $\lambda$ directly influences the 
distance between $p_0$ and $p_2$. For left and right, the calculations are independent of $\lambda$. 

down: The distance $d_x := |p_0.x - p_2.x|$ is projected from $\mathcal{A}$ along the laser beams to 
the rotary plate, as shown in Fig. 3 top. By using $\lambda$, $d_x$ and $D$, a triangle can be constructed. 
Thus, we can determine $\varphi$ by the ratio of the distance $d_x$ and the distance $D$, where $d_x$ 
depends on both $\varphi$ and $\lambda$: 


$$\varphi = 180^\circ - \lambda - \arcsin\left(\frac{d_x \cdot \sin(\lambda)}{D}\right) \tag{2}$$


up: Analogous to down, a triangle can be constructed, but in this case $\lambda$ does not occur in 
the triangle directly, as shown in Fig. 3 bottom: 


$$\varphi = \lambda - \arcsin\left(\frac{d_x \cdot \sin(180^\circ - \lambda)}{D}\right) \tag{3}$$

Fig. 3 Derivation diagram of the tilt angle calculation for the cases $O_s$ = down (top) and $O_s$ = up 
(bottom) 

left and right: The markers $m_0$ and $m_2$ are not on a line parallel to the laser beam. However, 
the distance $d_y := |p_0.y - p_2.y|$ is independent of $\lambda$. Because of this, $\varphi$ can be calculated by the 
ratio of the recorded distance $d_y$ and $D$: 


$$\varphi = \arccos\left(\frac{d_y}{D}\right) \tag{4}$$
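
The three tilt angle cases can be sketched in Python as shown below. This is an illustration under stated assumptions rather than the authors' implementation. For up and down it applies the law of sines to the triangle formed by the recorded distance, the marker distance and the laser beam, so the arcsine argument is taken to be the recorded distance times the sine of the laser angle, divided by the marker diameter. The function name, the argument layout and the clamping of ratios to 1 (to tolerate measurement noise) are assumptions.

```python
import math


def tilt_angle_deg(orientation, p0, p2, diameter, laser_angle_deg):
    """Estimate the tilt angle in degrees from the recorded (x, y) positions p0 and p2."""
    if orientation in ("left", "right"):
        d_y = abs(p0[1] - p2[1])
        # The y-extent is unaffected by the laser angle: d_y = D * cos(tilt).
        return math.degrees(math.acos(min(1.0, d_y / diameter)))

    d_x = abs(p0[0] - p2[0])
    lam = math.radians(laser_angle_deg)
    # sin(180 deg - lambda) equals sin(lambda), so both cases share this term.
    gamma = math.degrees(math.asin(min(1.0, d_x * math.sin(lam) / diameter)))
    if orientation == "down":
        return 180.0 - laser_angle_deg - gamma
    if orientation == "up":
        return laser_angle_deg - gamma
    raise ValueError("unknown orientation: " + str(orientation))
```

For a laser angle of 90°, all cases reduce to the same foreshortening relation: a recorded distance of half the marker diameter gives a tilt of 60°.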


Position: We move the TT with respect to the emanated laser beams of $\mathcal{Y}$ and the TT 
mounting height $z_T$. Let $M$ be the centre of the recorded positions $\mathcal{Y}$. Then the turntable 
position $T_T$ can be defined with the laser angle $\lambda$, the mounting height $z_T$ and the function 
$T(\varphi, O_s)$: 

T (P, Os) .2— ZT 


Tr := M+ tana) a i +T (g, Os) (5) 


Thus, the turntable calibration TTC is achieved with Eq. 1 for the orientation, Eqs. 2, 3 or 4 
for the tilt angle and Eq. 5 for the position. 


4 Workpiece Registration 


The workpiece registration is also tied to the DOE laser and moving the robot to specific 
points. In contrast to the turntable, there are no markers defined on the workpieces from 
the outset. To deal with this, we use points which are located on the workpiece surface resting 
on the rotary plate. With these points we define correspondence pairs between the workpiece 
and the rotary plate. 

For each correspondence pair, the user first defines a relevant point on the digital surface 
shown in the graphical user interface of our framework. Then, the robot is guided by the 
user, so that the DOE laser points to the corresponding point in the real world. This position 
is stored and the system calculates the intersection of the laser beam with the upper surface 
of the rotary plate from the stored robot pose as the second point in the correspondence pair. 
There should be at least three correspondence pairs, since we define the contact plane to the 
rotary plate. After the definition of the correspondence pairs, the transformation between 
object and TT is calculated with pose estimation using singular value decomposition [15]. 
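
The second point of each correspondence pair results from a ray-plane intersection. A minimal Python sketch is given below. It assumes that the laser beam is available as an origin and a direction in world coordinates, and that the upper surface of the rotary plate is given by a point and a normal, both of which would follow from the turntable calibration. All names are hypothetical, and the subsequent SVD-based pose estimation is not shown.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def intersect_ray_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Intersect a laser beam with a plane; return the 3D point or None.

    None is returned if the beam is parallel to the plane or points away from it.
    """
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None
    t = dot(sub(plane_point, origin), plane_normal) / denom
    if t < 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))
```

For example, a beam starting at (0, 0, 1) with direction (1, 0, -1) hits the plane z = 0 at (1, 0, 0).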


5 Experiments 


We evaluate our approach in terms of accuracy and execution time for both proposed exten- 
sions. For this, we use the cell proposed in [1] and extend the programming framework 
presented in [4]. In terms of intuitiveness, a simple wizard-based UI was designed for the 
TT calibration (Fig. 4 left). Also, it is possible to save and load calibrations. For the workpiece 
registration, we implemented and compared two approaches: A laser-assisted registration 
method as proposed in Sect. 4 and the definition of correspondence points by clicking on 
the upper surface of the TT, so that the laser is not required (Fig. 4 right). 


5.1 Accuracy of Turntable Calibration 


We performed the calibration on various tilt angles $\varphi$. In the context of fibre spraying, 
$\varphi = 0^\circ$ and $\varphi = 90^\circ$ are most relevant. Additionally, we used different $\varphi \in [0^\circ, 45^\circ]$ to 
show the accuracy for general tilt angles. Overall, 30 calibrations were performed: eleven 
each with $O_s$ = left and $O_s$ = down, and eight with $O_s$ = up. We neglected $O_s$ = right, 
since the calculation is identical to left. The ground truth position of the TT was gauged via 
measuring tape and the orientation was measured with a gyroscope sensor with an accuracy 
of ±0.06°. 

Fig. 4 After turntable calibration in a wizard-based UI dialogue (left), a workpiece is registered via 
a picking method (right) or a laser-assisted method 

Fig. 5 Results of the turntable calibration experiment. Left: Error in position x [cm]. Centre: Error 
in position y [cm]. Right: Error in tilt [°] 

Figure 5 shows the results of our evaluation. The error in the x-axis is consistently smaller 
than in the y-axis, and both errors scatter to a similar extent, resulting in an error range of 
up to 1.5 cm when neglecting the two outliers. For most calibrations, the tilt angle error is in 
[−0.2°, 1.1°]. The error of the tilt angle $\varphi$ is smallest for $O_s$ = left, and for $O_s$ = up a 
similarly good accuracy is achieved, except for one outlier. In the case of $O_s$ = down, the 
errors shift towards negative values, and two calibrations failed completely with tilt errors of 
10.9° and −22.5°. It is not clear what exactly caused these two failed calibrations. 

In summary, the test cases for up and left have, on average, a tilt angle error in the 
sub-degree range and an average position accuracy in the sub-centimetre range. Although 
the error is slightly larger for down, the overall result is accurate, even though the 
calibration requires only a minimal number of markers. 


Fig. 6 Classes of workpieces for the registration evaluation: defined corners (left), rotationally 
symmetric (centre) and workpieces whose contact surface to the rotary plate cannot be targeted 
with the laser (right) 


5.2 Accuracy of Workpiece Registration 


We evaluated the cases of $\varphi \in \{0^\circ, 44^\circ, 90^\circ\}$ for three different orientations 
$O_s \in \{\mathrm{left}, \mathrm{up}, \mathrm{down}\}$. Further, we distinguish three classes of workpieces: workpieces 
with defined corners (hexagon, Fig. 6 left), which can be easily clicked and approached; 
rotationally symmetric workpieces (tube, Fig. 6 centre), which offer no cues for point 
definition such as corners; and workpieces whose contact surface to the 
rotary plate cannot be targeted with the laser (shell, Fig. 6 right). We performed the 
registration 50 times, with the two mentioned acquisition methods (laser-assisted and 
picking) and used a different number of correspondence points each time. The workpieces 
are bolted in the centre of the TT, so that the position of the workpiece is exactly known 
with respect to the TT position. Since the definition of the relevant points is left to the user, 
a presumably optimal choice of points is assumed. 

We measured the error of the centre position of the workpiece, the error of the 
contact surface, which is the surface containing the relevant points, and the rotation error 
around the normal vector of the contact surface. Figure 7 shows the accuracy results of all 
performed registrations. In general, all registrations achieve sub-centimetre accuracy. All 
workpiece classes have a similar spread in the sub-centimetre range, and the registrations 
that utilize more correspondences are more accurate. Regarding the orientation, the contact 
area often matches perfectly. However, there is up to 1.5° contact area error for the shell, 
independent of the registration method used. The largest outliers in the rotation error are 
due to the use of the picking method. The remaining data points cluster in a range of [-2°, 2°] 
rotation error. 
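
The three error measures named above can be illustrated with a minimal sketch. It assumes that each pose is available as a centre position, a contact-surface normal and one reference direction lying in the contact surface; this representation and all function names are hypothetical and not taken from the evaluation software. 

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    # Raises ZeroDivisionError for a zero vector, which is not a valid direction.
    return _scale(a, 1.0 / math.sqrt(_dot(a, a)))


def centre_error(c_est, c_ref):
    """Euclidean distance between estimated and reference centre (same unit as input)."""
    d = _sub(c_est, c_ref)
    return math.sqrt(_dot(d, d))


def contact_plane_error_deg(n_est, n_ref):
    """Unsigned angle in degrees between estimated and reference contact-surface normals."""
    cos_angle = max(-1.0, min(1.0, _dot(_unit(n_est), _unit(n_ref))))
    return math.degrees(math.acos(cos_angle))


def rotation_error_deg(x_est, x_ref, n_ref):
    """Signed angle in degrees about n_ref between two in-plane reference directions."""
    n = _unit(n_ref)
    # Project both directions onto the contact surface before comparing them.
    p_est = _sub(x_est, _scale(n, _dot(x_est, n)))
    p_ref = _sub(x_ref, _scale(n, _dot(x_ref, n)))
    return math.degrees(math.atan2(_dot(_cross(p_ref, p_est), n), _dot(p_ref, p_est)))
```

Projecting the reference directions onto the contact surface first separates the rotation about the normal from the contact-plane error, so the two angles can be reported independently. 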

The registration with picking is more accurate than the laser-assisted approach. One 
reason for this is that workpieces are always mounted in the centre of the rotary plate, so 
it is easier to click the points accurately, as the centre can be found relatively well by 
the human eye. We also notice slightly worse accuracy when tilting is present. Considering 
that our ground truth data is subject to error, the accuracy of the position is independent of 
the orientation and tilt of the turntable, as well as the recording method. However, using 
clearly defined corner points on the workpiece, as well as using a larger number of relevant 
points, improves the accuracy of the method further. The orientation of the registration is most 
accurate when the laser-assisted method is used. The contact surface rotation is error-free 
for the tube and hexagon, and even the ~1.5° contact plane rotation is still usable within 
our application, the fibre spraying process. 

Fig. 7 Results of our workpiece registration evaluation. a)-b) Error in translation [cm]; c) error in 
contact plane [°]; d) error in rotation [°]. The overall results are shown with subsets characterized 
by the registration method used 


5.3 Execution Time 


We measured the amount of time to perform the turntable calibration and the workpiece 
registration during the aforementioned experiments. The turntable calibration time is defined 
as the time span between opening the wizard for the calibration and loading of the turntable 
3D model into the simulation with the calculated calibration. In case of the workpiece 
registration, we measured the time from starting the UI dialogue for registration until the 
object is shown in the simulation window. 

Most of the calibrations are performed in less than 150 s, with a minimum of 87 s, as shown 
in Fig. 8. The outliers occur in the case of A = 90°, at which approaching by hand guidance 
is more strenuous. For the registration, differences are seen between the methods. The 
laser-assisted method requires an amount of time comparable to the calibration, while the 
picking method takes less time. The wide range in the registration duration 
results from the varying number of correspondence pairs. On average, each additional 
correspondence pair adds ~10 s for picking and ~20 s for the 
laser-assisted method. 

Fig. 8 Measured execution time for calibration and registration [sec] 

Picking is carried out more efficiently because of the workpiece mounting and because it 
is performed only in the GUI. Overall, a fully known setup with workpiece and turntable 
is achieved in less than five minutes. A further reduction of this time is possible 
through regular application. 


6 Conclusion 


We present a method for the calibration of a turntable and the registration of workpieces 
based on the calibration. In contrast to the current trend in research, our method does not 
rely on several visual sensors. Instead, we utilize only a DOE laser attached to the robot. The 
DOE laser is used as a visual aid in approaching individual points, which are used for 
the determination of the turntable and workpiece poses. For successful calibrations, the 
translation error scatters in a 1.5 cm interval, while the tilt is in a range of [-3°, 3°] off 
the measured angle. The workpiece registration accuracy is in the interval of [-1, 1] cm. 
The rotational error of the registration is in the range of [-6.6°, 3.4°]. For both methods 
combined, five minutes of time effort are needed. The approach can be used in the production 
of small batches, which in the worst case requires a new registration for each program execution. 
In future work, the known objects will be used to simulate fibre spraying processes. With this, 
robot trajectories can be verified for correctness before execution and, if needed, optimized 
to reduce production costs. 


Digital Geometry Recording 
for Automation of the One-Off 
Production of Pipes 


Jacques Biltgen, Sascha Lauer, Martin-Christoph Wanner 
and Wilko Flügge 


Abstract 


Manual production in pipeline construction is often due to the fact that either 
the capacities for automation are lacking or the production is too individual for a simple 
automation solution. However, automated production would increase productivity 
and quality, especially in metal processing. The challenge in the manufacturing 
process of fitting tubes is the batch size of one. Nevertheless, a non-time-consuming 
programming solution must be found to integrate a robot-based solution economically 
into the production chain. Offline path planning based on CAD models would be a 
suitable solution. To ensure that the robot-welded seams comply with the standards, 
there has to be consistent quality in the seam preparations. For the final quality, the 
direct integration of the CAD flow would be important. Due to transport and limited 
space, it is often impossible to use sensors to scan the piping. When technical 
documentation is lacking, the pipes are still measured by hand, especially when replacing 
or modifying pipelines. The geometry survey is done in several steps: first, a rough 
drawing is made on the vessel; this is then converted into a technical sketch and only 
later transferred to a CAD programme. There can be several days between these steps, 
and different operators may be involved. A method is presented here to combine these steps 
with the support of a tablet. For this purpose, software is used on the tablet to digitise the 
geometries and prepare them for further offline path planning. 


Keywords 
Digitalization - Robot welding - Process optimization - Pipe production - 
Robotics - Automation 


1 Introduction 


In pipeline construction, especially in repair, fitting pipes play an essential role. The 
manufacturing steps are mostly carried out manually. The challenge with robot-assisted 
automation is the small batch size, with mainly one-offs being involved. Since the selec- 
tion of standard components is defined and recurring components are used, robot-sup- 
ported automation is nevertheless possible. To be able to integrate this profitably into the 
entire manufacturing process, the upstream and downstream processes must be consid- 
ered in addition to the welding process. 

For an economic integration of a robot-based solution, the weld seams have to be 
produced with a consistent quality. To achieve this, the upstream processes have to be 
adapted. In this case, it is advisable to collect data of the pipe geometries during the 
manufacturing process and to integrate it into the path planning. In this way, even com- 
plex unique structures can be welded with a robot. 

In the best case, the CAD data are integrated directly into the manufacturing process to 
implement a follow-up of the data and the quality. The digitalization of pipes is already 
available on the market in various design software packages [1-3]. These are based on 
the new planning of plants and are therefore unsuitable for integration on the construc- 
tion site. A direct and intuitive integration of the technical data into the entire production 
process is not practicable. Moreover, additional measurement equipment is required for 
a detailed representation of the plants. In the case of pipe rehabilitation on ships, this is 
unsuitable due to the time-consuming transport, limited accessibility, and varying light 
conditions. Digital Mockups (DIM) are essential in the steps of planning, construction 
and accessibility analyses. They are used especially in aircraft construction [4], where a 
scan of the fuselage is carried out and transferred to CAD. Furthermore, there is the pos- 
sibility to build up the DIM from CAD data. Another advantage of these mock-ups is the 
comparison of target geometries (CAD) and actual geometries. Due to the high effort, 
these procedures are mainly carried out for large batch sizes. Due to the lack of docu- 
mentation, the long journeys and very small premises, scanning the pipes is very work 
intensive and is therefore not carried out. 

A direct integration of the technical data already on site is necessary to improve the 
quality and the expenditure of time during the entire process chain [5]. Due to these defi- 
cits, a methodology was developed which enables the integration of CAD data into the 
process chain on the construction site. 

Currently, a rough hand sketch of the pipe construction is made at the construction 
site and photos are taken for documentation purposes. After this, a technical sketch is 
derived based on this manual sketch, which is then transferred to the CAD worksta- 
tion. The CAD design and a list of parts are derived to order the components. Due to the 
repeated drawing and changing staff members, errors and mismatches can occur, which 
can lead to deviations in the required fitting tube. Since the CAD model is not directly 
integrated into the process chain, these deviations are not detected until the end of the 
process chain. This leads to increased rework effort. 


2 Approach to Automation 


To realize an automation of the manufacturing process by means of a robot, 
the entire process chain, the manufacturing process and the material spectrum have to be 
analyzed. Here, the welding of the pipe segments, which has so far been done manually, has 
to be optimized by means of a robot. To achieve the required accuracies for this step, the 
upstream process steps have to be analyzed and adapted if necessary. 


2.1 Analysis of the Manufacturing Process 


In one-off production, as is the case in the production of fitting pipes, an 
exemplary process chain includes the following successive work steps: 


• cutting the individual pipe segments 
• seam preparation at the pipe ends 
• pre-positioning of the pipes in relation to each other 
• spot welding as preparation for welding 
• welding the joints with a robot. 


The main part of the process chain is taken up by the programming of the welding robot. 
Therefore, it is important to implement a fast and intuitive integration of the used 
components in the postprocess and to counteract time-consuming programming. When 
integrating standard parts like pipe bends, branches, or flanges, predefined libraries should 
be used to reduce the programming effort; these can be extended and adapted afterwards. 
Due to the complexity of the components, an offline program is defined from the 
individual CAD data. 

To realize welding automation by means of a robot, the quality of the upstream 
manufacturing steps has to be improved. These production steps include cutting, 
pre-positioning, seam preparation and tacking. The direct integration and use of CAD data 
guarantee the required quality. 


2.2 Analysis of the Material Spectrum 


For the realization of automation, the material spectrum must be considered. This term 
includes the geometries and the materials. Figure 1 shows pipe constructions, which are 
used for repair work on ships. Standard components with individual pipe lengths are 
used. 

For the welding process, the material and the weld preparation are equally important. 
The most common joints are butt joints. In addition to these, there are pipe branches in 
various configurations. For this, offline path planning is indispensable, as manual pro- 
gramming of path support points is too time-consuming. The use of optical sensors in 
online programming can lead to problems with triangulation due to the different angles. 


2.3 Analysis of the Production Chain 


The manufacturing chain can be divided into three main sections: 


• the upstream manual processes, 
• the partially automated joining processes and 
• the downstream processes. 


The upstream processes include the determination and acquisition of the geometry data 
on site. In the partially automated welding process, the steps described in Chapt. 2.1 are 
considered. The downstream processes include sawing, surface treatment, straightening 
and quality control. 


Fig. 1 Examples for pipe components 

The upstream processes are time-consuming and offer a lot of potential for savings. 
Likewise, these processes have a great influence on the final quality. 


2.4 Derivation to Automation 


In the previous chapters, the current state was explained and the process steps that have 
potential for optimization were shown. The resulting information is now to be used to 
derive a digitalization. The digitalization should be designed in such a way that a direct 
communication between the recording of the geometry data and an automated solution 
is possible. A functional concept is to be developed from the previously derived require- 
ments. 

It has been found that for a robot-based welding concept in a one-off production, 
offline path planning on CAD data is the most sensible solution due to its high flexibil- 
ity. However, the generation of CAD data can be error-prone due to multiple processing 
steps and thus has a major impact on automation. To avoid these multiple steps, the geo- 
metric data should be recorded in final form on the construction site. Since the compo- 
nents used for fitting pipes are standardized, the use of a mobile operating device with a 
database of the standard components is the obvious choice. Figure 2 shows a comparison 
of the conventional process and the automated process. Specially developed software 
allows different processes to be combined on a smartpad. 

The smartpad takes over the creation of sketches, the taking of photos and notes, the 
recording of geometries and the different components with their specific values. The use 
of a digital solution eliminates the duplication of drawings. Another advantage results 
from the fact that all elements are already defined on site and misinformation is mini- 
mized by simplified sketches and parts lists. 

This makes it possible to integrate the CAD workstation directly into the production 
chain and reduce the process throughput time. By integrating the CAD data, further pro- 
cess steps can also be partially or fully automated. In addition to automated welding, 
these include cutting and seam preparation of the pipes. 

Due to their structure, programs for documentation purposes have programmatic hier- 
archies. These can be used to create a defined data structure. Through a targeted query 
of customer data as well as order data within the program structure, the projects can be 
structured and systematically stored. This systematic approach simplifies the traceability 
of data and increases quality. An important step for digitization is the definition of the 
most important information. 

The CAD data generated this way is then used for path planning of the robot. A simu- 
lation tool is used to generate this data. In this tool, the seams can be selected, and the 
required paths are generated. Thus, collision control and control of the welding paths are 
already possible. The next step is the transfer of the generated welding paths to a robot 
cell. 

Fig. 2 Comparison of the conventional to the automated process chain 


3 Implementation of the Developed Process Chain 


The developed concept was integrated into the existing process chain. Two different pro- 
grams were developed for the integration. The first program is an app on an Android 
tablet, where the technical data of the designed pipe constructions are entered. Ideally, 
the app guides the operator systematically through the raw design. This information is 
used in the production chain to optimize the process. For this purpose, a second program 
is used, which is integrated in a CAD tool. This program creates 3D models and techni- 
cal documentation from the entered data. In order to test the application possibilities of 
the developed software, a field test was carried out and evaluated. 


3.1 Operating Concept on the Smartpad 


To implement the digitisation of the pipe constructions already on the construction site 
without additional devices, software for a mobile operating device was developed. This 
enables an operator to create a pipe construction from several pipe segments. During 
the input, the pipe construction is displayed according to DIN EN ISO 6412-2 [6] (see 
Fig. 3). This type of representation allows the operator a standardised visualisation. A 
database of standard components is stored in the software. With the help of this database 
standardised components plus their technical parameters can be entered for the fitting 
pipes. The complexity and size of the fitting pipes are not limited by the software. 

To create a pipe, the starting point must be entered in the input field for the start 
coordinates (7) and the end point of the respective pipe segment in the input 
field for the end coordinates (8). The pipe diameter and the desired wall thickness must 
be selected in the control panel (3). Then the weld fittings can be defined in the selection 
window for weld fittings (flanges, heads, tees, and reducers) (6). The program automatically 
selects the correct fittings for the selected pipes. Then the segment is added 
to the current pipe via the “Hinzufügen” (“Add”) button (2). The pipe segment is listed 
in the table (4) and shown in the drawing (5) according to the standards. To enter connected 
pipe segments more quickly, the corresponding start option can be selected by 
choosing the input mode (9). There are several options that make more complex entries 
more user-friendly, e.g. connecting a pipe segment to a T-piece. Additional information, 
such as surface treatments or pipe shapes, can be specified using the input fields for 
additional information (10). Additional design and layout options are available via the 
option buttons (1). New projects and sub-projects are also created via these buttons. There 
is also an option for notes and photos. The advantage of the user interface designed in this 
way is its intuitive usability, which follows the real workflow. 

Fig. 3 User interface for digital geometry input 
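
The input sequence described above (start and end coordinates, diameter, wall thickness, weld fittings, then adding the segment to the current pipe) can be mirrored by a small data model. The following is only a sketch under assumptions: the class names, field names and units are hypothetical and do not reflect the internal structure of the app. 

```python
import math
from dataclasses import dataclass, field


@dataclass
class PipeSegment:
    start: tuple               # (x, y, z) start coordinates, assumed to be in mm
    end: tuple                 # (x, y, z) end coordinates, assumed to be in mm
    diameter_mm: float
    wall_thickness_mm: float
    fittings: list = field(default_factory=list)   # e.g. ["flange", "tee"]
    notes: str = ""            # additional information such as surface treatment

    def length_mm(self):
        """Straight-line distance between start and end point."""
        return math.dist(self.start, self.end)


@dataclass
class PipeConstruction:
    segments: list = field(default_factory=list)

    def add(self, segment):
        """Append a fully specified segment (the "Add" step)."""
        self.segments.append(segment)
        return segment

    def add_connected(self, end, **properties):
        """Input-mode shortcut: start the new segment where the previous one ended."""
        if not self.segments:
            raise ValueError("no previous segment to connect to")
        return self.add(PipeSegment(start=self.segments[-1].end, end=end, **properties))
```

A connected-input mode of this kind only asks for the end point of each further segment, which is one plausible way to speed up entry of continuous pipe runs. 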

After all the necessary data has been entered as described above, the software creates 
a parts list. This can be displayed and edited. The software always selects standard com- 
ponents for creation. However, if other components are needed in the project, they can 
be added or changed in the parts list. Figure 4 shows the editing of the parts list using 
the example of the pipe bends. Finally, the parts list is saved as a CSV file. The parts list 
is intended as a source of information for the following software in the CAD tool and is 
therefore not usable without further processing. 
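
Saving the parts list as a CSV file and reading it back in a downstream tool can be sketched as follows. The column names and the semicolon delimiter are assumptions made for illustration; the actual file layout used by the software is not specified here. 

```python
import csv
import io

# Hypothetical column layout of the parts list.
FIELDS = ["pos", "component", "diameter_mm", "wall_thickness_mm", "length_mm", "quantity"]


def parts_list_to_csv(parts, delimiter=";"):
    """Serialise a list of part dictionaries to CSV text, numbering the positions."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter=delimiter,
                            lineterminator="\n")
    writer.writeheader()
    for pos, part in enumerate(parts, start=1):
        writer.writerow({"pos": pos, **part})
    return buffer.getvalue()


def parts_list_from_csv(text, delimiter=";"):
    """Parse CSV text back into a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

Because every value comes back as a string, the consuming CAD-side program would still have to convert and validate the entries, which matches the remark that the file is not usable without further processing. 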


3.2 Integration into the Process Chain 


The aim of digitalization is to increase productivity and quality by capturing the pipe 
geometry already at the construction site. After the pipe construction has been created 
with the mobile operating device, the software creates a project-specific folder in 
which all images, the CSV file and notes are assigned to the construction. The folders 
are assigned to defined customers and projects and can also contain sub-projects. The 
automated creation of the folder structure allows for easy traceability and editing. The 
prepared CSV file can be loaded into a CAD tool, which then saves a 3D model and technical 
documentation such as parts lists and material lists with all the required information. 
Figure 5 shows the 2D visualization in the app and a 3D view in a CAD tool. 

Fig. 4 Display and editing of the parts list 
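
The automated folder creation can be illustrated with a short sketch. The hierarchy customer / project / sub-project follows the description above, while the sub-folder names `images` and `notes`, the file name and the function names are hypothetical. 

```python
from pathlib import Path


def create_project_folder(root, customer, project, sub_project=None):
    """Create (or reuse) the folder for one pipe construction and return its path."""
    folder = Path(root) / customer / project
    if sub_project:
        folder = folder / sub_project
    # Assumed layout: photos and notes are kept next to the CSV file.
    (folder / "images").mkdir(parents=True, exist_ok=True)
    (folder / "notes").mkdir(exist_ok=True)
    return folder


def store_parts_list(folder, csv_text, name="parts_list.csv"):
    """Write the CSV parts list into the project folder and return the file path."""
    target = Path(folder) / name
    target.write_text(csv_text, encoding="utf-8")
    return target
```

Deriving the path purely from customer, project and sub-project names keeps every construction traceable without a separate index. 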

The conversion into a 3D model takes place automatically in the CAD tool. The 
standard components used are defined by libraries. The lengths of the individual pipe 
segments are formed depending on the total lengths entered and the specific weld fit- 
tings. After creating the 3D model, the required pipe lengths and a quantity structure are 
exported to the operator. 
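
How a pipe length can follow from the entered total length and the weld fittings is sketched below. This is a simplified assumption, not the rule implemented in the CAD tool: each fitting is reduced to a single installation length along the pipe axis, and an optional weld gap per fitting is subtracted. 

```python
def cut_length_mm(total_length_mm, fitting_lengths_mm, weld_gap_mm=0.0):
    """Straight pipe length to cut for one segment.

    total_length_mm:     overall length entered for the segment
    fitting_lengths_mm:  installation length of each fitting on the segment
                         (hypothetical values, normally taken from a component library)
    weld_gap_mm:         assumed root gap left at each fitting
    """
    deduction = sum(fitting_lengths_mm) + weld_gap_mm * len(fitting_lengths_mm)
    cut = total_length_mm - deduction
    if cut < 0:
        raise ValueError("fittings are longer than the entered total length")
    return cut
```

For example, a segment entered with 1000 mm total length and two fittings of 76 mm and 45 mm would leave 879 mm of straight pipe under these assumptions. 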

The 3D model can now be integrated into an offline path planning. The MotoSIM 
tool [7] from Yaskawa was used in the trials. The individual seams can be selected in the 
tool and the set parameters (torch angle, angle of the positioner, approach, and departure 
paths, etc.) can be checked. Afterwards, the check for collision and reachability can be 
carried out. An example of such a setup can be seen in Fig. 6. 


3.3 Evaluation 


In the methodology described, a digital concept was developed that shortens an existing 
process chain and improves the data exchange of individual processes. The digital solu- 
tion has already been tested in field trials as part of a project. Simple pipe connections 
can be created intuitively and quickly via the tablet. 

During the field tests, a reduction in processing time was observed using the software. 
Especially the representation according to DIN EN ISO 6412-2 [6] showed great poten- 
tial in the field tests. The pipes to be replaced could already be matched with technical 
drawings here. 


Fig. 5 Creation of a 3D view and technical documentation from a technical drawing 

Fig. 6 Path planning on a digitalized pipe using offline programming 


The field tests were realised up to the CAD workstation. The pipes could be trans- 
ferred into a 3D model without any incidents. The CAD tool used for the verification 
was AutoCAD [8]. In AutoCAD, STEP files could be automatically generated and output, 
which could be integrated and processed in offline path planning programs and in 
CNC processing programs. In the development phase only tubes in thin sheet metal were 
processed. The generation of the 3D models still required a lot of computing power in 
some cases. 

Challenges arose in the representation of pipe nodes. The intersections between the 
pipes could not be modelled correctly due to complex interdependencies. Due to miss- 
ing dependencies between individual components in AutoCAD, the pipe cannot be repre- 
sented isometrically. 

All in all, it could be determined during the field test that the upstream work steps can 
be automated in an economically efficient way. 

Integration into the simulation software MotoSIM was carried out under laboratory 
conditions. During the tests with the simulation software, the complexity of the seams 
and the pipes became apparent. The paths created by the software had to be checked and 
adjusted frequently for the appropriate welding position. The problem here was that a 
two-axis positioner was used instead of a three-axis positioner. 


4 Conclusion 


It could be proven that the digitalization of the fitting pipe production is already possible 
on the construction site. This is done without additional measuring equipment. Automation 
saves time by eliminating additional individual steps and increases quality by 
improving the traceability of the data. The data generated by this method can be transferred 
to various CAD tools. 

The next steps are the continuation of the digital process chain, see Fig. 3. Currently, 
the steps up to the technical documentation have been considered. The transfer of seam 
preparation for automated welding will be a further step. The model data can already 
be entered into conventional CNC and path planning programs. However, a comparison 
between the actual and target model is desirable. The pipes are subject to manufacturing 
tolerances and have a conicity and wall thickness deviation [9]. This results in new chal- 
lenges for the automated welding process. 

An automated planning of the seam preparation for the thick sheet area has not yet 
been implemented. The integration of an automated seam planning would be a further 
optimisation possibility. When calculating the individual intersection contours, the differ- 
ent angles of attack, the tube diameters, the material thicknesses, and the seam prepara- 
tion must be considered [10]. 

The integration into an offline program was tested. The next step here is the transfer 
of the welding tracks to a real robot cell. 


Detection and Handling of Dynamic Scenes During 
an Active Vision Process for Object Recognition 
Using a Boundary Representation 


Dorian Rohner, Johannes Hartwig and Dominik Henrich 


Abstract 


For robot manipulators, it is nowadays necessary to know their surroundings. This knowl- 
edge consists at least of a world representation with recognized objects. During the recon- 
struction of scene objects from multiple views, changes, like positioning of the objects, 
or additional unwanted signals, like parts of a human co-worker, may occur. In this paper, 
we classify the possible changes for a specific type of representation (boundary represen- 
tation models). Afterwards, we present an approach to detect and handle these changes 
to maintain a valid world model. To achieve this, we compare what should be visible 
in the world model reconstructed thus far with the actual information from the current 
view. The detected change is handled by using object hypotheses as well as geometric 
information from the world representation. Based on an evaluation, we show a proof of 
concept and the usefulness of our approach and suggest future work. 


Keywords 


Computer Vision * Robotics * Environment Reconstruction 


1 Introduction 


Robot manipulators are more frequently used in households and small and medium-sized 
enterprises (SME) [1]. In such applications, the robot’s environment may change over time. 
The robot uses sensors to detect and understand the scene. This includes information about 
which workpieces are present and where they are located. A commonly used 
approach is to have a camera as a sensor and combine it with visual object recognition 
techniques. However, a single view of the scene is not always sufficient to identify all 
objects, e.g. due to occlusion or large scenes. The approach of generating new views is 
known as active vision [2]. Based on the scene recorded thus far, new views gathering 
additional information about the scene are determined and incorporated into the world rep- 
resentation. During the movement of the robot to the next view, a human co-worker may 
modify the scene. This invalidates some information in the robot’s world representation. 
Additionally, it is possible, that the sensor captures an unwanted signal, e.g. an arm of a 
co-worker. In both cases, it is not known a priori, which part of the world representation 
is still valid or which part of the signal is useful. Therefore, it is necessary to develop and 
evaluate an approach to handle dynamic scenes as well as unwanted signals. 

In this paper we present a novel approach regarding the handling of dynamic scenes 
using a 3D scene reconstruction. Based on the state of the art (Sect. 2) and our preliminary 
work (Sect. 3.1), we identify requirements, make necessary assumptions, and present our 
overall concept (Sect. 3.2). We classify all possible cases of how changes in the environment 
can occur between two recordings (Sect. 3.3) and describe a method to detect changes for 
a specific kind of world representation (Sect. 3.4). As in our previous work [3, 4], we use 
boundary representation models (B-Reps). We present an approach to handle changes and 
to incorporate them into the world representation to assure the validity after each view 
(Sect. 3.5). This approach is evaluated by a proof of concept as well as a comparison between 
our method and the ground truth B-Rep (Sect. 4). Finally, we discuss our contribution and 
future work (Sect. 5). 


2 State of the Art 


The detection and handling of dynamic scenes is encountered in several fields of research, 
e.g. in autonomous driving, computer vision, and robotics [5]. In these different applications 
the problem and the solution can be viewed under multiple aspects. On the one hand, the 
type of internal scene representation is of interest. This ranges from LIDAR sensor data 
[6], over point clouds [7, 8] to bounding boxes [9] (e.g. from semantic segmentation). The 
representation of the scene impacts the possibilities of detecting and handling changes. 
When using point clouds, either each point must be handled on its own or a segmentation is 
necessary to group multiple points into clusters. In the case of semantic segmentation, these 
clusters are complete objects and can be used to identify changes between two frames. 

On the other hand, changes can be handled in different ways: One method is to introduce 
a time component and aging [10, 11] to remove knowledge, which was not validated for 
a certain amount of time. The basic idea is to attach a certainty to each object instance 
and decrease it with increasing time and human proximity, as humans can only manipulate 
objects to which they are spatially close. Whenever an object is visible in the current view, 
the certainty is reset. Another approach is based solely on the given model, by comparing 
multiple views and reasoning about which parts of the scene are still present. This can be 
done by utilizing background detection methods [12], in which multiple frames are processed 
and moving objects can be identified in the foreground. Alternatively, two given frames can be 
compared to detect possible changes. Overall the goal is to obtain a valid representation of 
the scene at all times. 
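
The aging idea described above can be sketched in a few lines. The exponential decay, the linear proximity factor and all parameter values are assumptions chosen for illustration; they are not the formulations used in [10, 11]. 

```python
import math
from dataclasses import dataclass


@dataclass
class TrackedObject:
    name: str
    certainty: float = 1.0


def age_certainty(obj, dt_s, human_distance_m, half_life_s=60.0, reach_m=1.0):
    """Decrease the certainty over time, faster while a human is within reach."""
    rate = math.log(2.0) / half_life_s
    if human_distance_m < reach_m:
        # Assumed proximity model: the decay rate doubles at zero distance.
        rate *= 1.0 + (reach_m - human_distance_m) / reach_m
    obj.certainty *= math.exp(-rate * dt_s)
    return obj.certainty


def confirm_visible(obj):
    """Reset the certainty when the object is seen in the current view."""
    obj.certainty = 1.0


def prune(objects, threshold=0.2):
    """Keep only objects whose certainty has not dropped below the threshold."""
    return [o for o in objects if o.certainty >= threshold]
```

With these example parameters, an unobserved object far from any human loses half of its certainty per minute and is removed from the world model once it falls below the threshold. 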

Finally, it is of interest how the new poses for the sensor are obtained: for example, a human 
operator uses a hand-held camera [10] or a robot system decides autonomously where to 
move [11]. Depending on the application, the requirements regarding the correctness of the 
world representation differ. In some cases, it is acceptable if a small movement of an object 
is undetected and therefore not handled. In other applications, it is preferable if each change 
is detected, even if this results in too many detected and handled changes. 

A special case of how new poses are obtained is the explicit tracking of objects [13]. For 
this approach it is necessary, that the object of interest is visible in a majority of all views. 
This assumption is difficult to fulfill in some robotic applications, especially when using an 
eye-in-hand camera. Another special case is visual servoing [14] in which the sensor follows 
the movement of the object. 

Based on the state of the art and our previous research (see Sect. 3.1), we examine a 
model-based approach regarding dynamic scenes. Especially the usage of B-Reps for detection 
and handling of change is of interest as the vision process utilizing B-Reps as a world 
representation is not well explored [3, 4]. We will focus on a model-based approach in this 
paper. On the one hand, existing aging methods can be applied to model-based approaches 
as well. On the other hand, it is of interest how B-Reps can handle dynamic scenes only 
based on their 3D geometry information. Furthermore, we use a robot-mounted sensor, 
whose motion is controlled by an active vision process for object recognition. Therefore, 
we cannot move the camera to specific poses to detect or further investigate changes. Due 
to this, tracking methods are not applicable. 


3 Our Approach 
3.1 Basic Approach for Static Scenes 


In our previous work we developed an approach for object recognition based on B-Reps [3]. 
In that work, we create an object database, in which every object is stored as a B-Rep. To 
obtain a scene representation, we use a robot-mounted depth camera and the resulting point 
cloud is transformed into a B-Rep [15, 16]. This B-Rep is the input for our object recognition 
approach. We determine multiple sets of hypotheses between the scene and objects from our 
database and select the best fitting one. Some objects may not be recognized with the current 
view; therefore, we determine new views by using our active vision approach [4]. At a new
camera pose we record another point cloud, transform it into a B-Rep, and merge it into the
scene representation. Afterwards we apply our object recognition method once again. This
procedure repeats until each object is correctly classified. The problem of dynamic scenes
arises between the capture of two point clouds from different views.


3.2 Enhancement for Dynamic Scenes 


In this section we present our overall approach. The reconstruction and object recognition 
are an iterative process, and we want to ensure a valid scene representation after each 
view. Therefore, every change has to be incorporated directly. We primarily focus on the 
faces represented in the B-Rep when handling the dynamic scenes. Faces represent all 
information stored in B-Reps. If the faces are correctly reconstructed, the vertices and edges 
contained within a face are correct as well. Furthermore, faces are a robust and high-quality 
representation of objects, as a face is calculated by averaging over numerous points from a 
point cloud [15]. Therefore, our overall B-Rep is valid and correct if all faces are correct. 
Faces are the most important feature for our object recognition approach, as well as the 
active vision method. 

The first step to handle the problem of dynamic scenes is to categorize the possible 
changes regarding faces. Based on this classification, we have to detect these changes within 
our representation. We capture a point cloud from a new view, convert it into a B-Rep, and 
compare what is currently visible and what should be visible based on the current world 
model from the current position. For each category, we discuss how this comparison can be 
calculated and how it can be detected within a scene. Finally, we handle the detected change 
in two ways. If object hypotheses are available, we utilize this information. If no hypotheses 
are available, we only use the information from the B-Reps given directly by the detected 
change. As an additional assumption (similar to our previous work [3, 4]), the position and
extent of our working surface (in our context a table) are known.


3.3 Classifying Possible Changes 


Based on our previous work and the given assumptions, we classify the possible changes. 
We have to compare the world model reconstructed thus far and the current view in every 
step to ensure a valid representation. As we do not use a certainty or an aging process, the 
existence of a face in a scene is binary, meaning it exists or it does not. Now we discuss 
every possible case regarding the visibility of the face in the world model and the current 
view: 


1. The face is not stored in the world model and ...
   (a) is visible in the current view: added face.
   (b) is also not visible in the current reconstruction: As this face does not exist, this case
       is not named and can be omitted.
2. The face is already stored in the world model and ...
   (a) is also visible in the current view: validated face.
   (b) is not visible in the current reconstruction and ...
       (i) it should be visible from the current point of view: removed face.
       (ii) it cannot be seen from the current point of view due to occlusions or camera
            limitations: occluded face.


3.4 Detecting Changes 


To detect every change between two B-Reps, we describe a method for every case mentioned
in the previous section. The basic approach for every case is to compare what is currently
visible and what should be visible based on the world model. If we find a difference, we
know something has changed within the scene and we can categorize this change.

We project the faces of the world model reconstructed thus far onto the 2D image
plane of our current pose $T_C \in \mathbb{R}^{4 \times 4}$. The world model is given by the B-Rep
$W = (F_W, E_W, V_W, B_W)$ (with faces $F_W$, half-edges $E_W$, vertices $V_W$, and boundaries $B_W$).
Since the world model has multiple views incorporated, the 2D projection may contain
faces which are not recordable from our current view. Therefore, we have to incorporate
the view frustum of our depth sensor and further physical limitations (e.g. angle of
incidence). All these limitations are collected in a tuple $L$. A projection can be described as
$\Phi(B, T, L) \to \{1, \ldots, \|F_B\|\}^{x \times y}$, with a B-Rep $B$ and a pose $T \in \mathbb{R}^{4 \times 4}$. The result stores in
each pixel which face is in front (by using an ID). So we can obtain the projection of the
B-Rep $W$ as $P_W = \Phi(W, T_C, L)$. We repeat this procedure with the reconstructed B-Rep
$C = (F_C, E_C, V_C, B_C)$ from the current view. Therefore, we have two 2D projections: one
of the world model reconstructed thus far, and one of the reconstruction from the current
view, $P_C = \Phi(C, T_C, L)$. It should be noted that we still have full knowledge of which face in
the 3D representation belongs to which pixels in the 2D projection. We can now compare
these two projections by looking at every projected face. For each face we can search for a
correspondence in the other projection. We can use these projections to determine whether
a face $f$ is visible (1) or not (0) from a pose $T$, if another B-Rep $B$ is present, regarding
physical limitations $L$, as $\varphi(f, B, T, L) \to \{0, 1\}$.

In addition, correspondences between faces (in the 3D representation) are calculated by a
function $\eta(f, g) \to \{0, 1\}$, based on their position, normal vector, and size. In our previous
work [4], $\eta$ matches the explainedby-function. This function returns for two faces $f$, $g$
whether they correspond (1) or not (0).
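
A correspondence test of this kind can be sketched as follows. This is only an illustration and not the explainedby-function from [4]: the `Face` record, the use of the face centroid as "position", and the three thresholds `max_dist`, `max_angle_deg`, and `max_area_ratio` are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    """Hypothetical planar face summary: centroid, unit normal, and area."""
    centroid: tuple[float, float, float]
    normal: tuple[float, float, float]
    area: float


def eta(f: Face, g: Face, max_dist: float = 0.02,
        max_angle_deg: float = 10.0, max_area_ratio: float = 1.5) -> bool:
    """Return True if f and g plausibly describe the same physical face.

    All three thresholds are illustrative assumptions, not values from the paper.
    """
    # Position: the centroids must be close to each other.
    if math.dist(f.centroid, g.centroid) > max_dist:
        return False
    # Normal vector: the angle between the unit normals must be small.
    dot = sum(a * b for a, b in zip(f.normal, g.normal))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    if angle > max_angle_deg:
        return False
    # Size: the larger area may exceed the smaller one only by a bounded factor.
    small, large = sorted((f.area, g.area))
    return small > 0 and large / small <= max_area_ratio
```

The frozen dataclass makes faces hashable, so they can be collected in Python sets when the change categories are computed.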

If a face is visible in the current view but not in the world model, we conclude that it was
added, resulting in $A = \{ g \in F_C \mid \nexists f \in F_W : \eta(g, f) \}$. If a face of the world representation
is also visible in the current view, then there is no change but a validation. This set of
validated faces is determined as $V = \{ f \in F_W \mid \exists g \in F_C : \eta(f, g) \}$. The case where a
face is visible in the world model but not in the current view has to be subdivided into two
cases. We have to make sure to detect occlusions correctly. If no correspondence is found,
but the face should be perceivable from the current pose, we know that this face was removed.
This can be denoted by $R = \{ f \in F_W \mid (\nexists g \in F_C : \eta(f, g)) \land \varphi(f, C, T_C, L) \}$. If a face
has no correspondence and is also not perceivable from the current pose, it is occluded:
$O = \{ f \in F_W \mid (\nexists g \in F_C : \eta(f, g)) \land \neg\varphi(f, C, T_C, L) \}$.
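
The four sets can be computed directly from these definitions. The following is a minimal sketch under stated assumptions: faces are any hashable objects, `eta` is the correspondence test, and `visible` is a caller-supplied stand-in for the visibility test of a world face against the current reconstruction, pose, and sensor limitations. The function name `classify_faces` is hypothetical.

```python
from typing import Callable, Hashable, Iterable


def classify_faces(world_faces: Iterable[Hashable],
                   current_faces: Iterable[Hashable],
                   eta: Callable[[Hashable, Hashable], bool],
                   visible: Callable[[Hashable], bool]):
    """Split faces into added (A), validated (V), removed (R), and occluded (O).

    eta(f, g) is the correspondence test; visible(f) tells whether a world
    face f should be perceivable from the current pose. Both are assumptions
    of this sketch and must be supplied by the caller.
    """
    world = list(world_faces)
    current = list(current_faces)
    # A: current faces without any correspondence in the world model.
    added = {g for g in current if not any(eta(g, f) for f in world)}
    # V: world faces with a correspondence in the current view.
    validated = {f for f in world if any(eta(f, g) for g in current)}
    # World faces without correspondence are either removed or occluded.
    unmatched = [f for f in world if f not in validated]
    removed = {f for f in unmatched if visible(f)}
    occluded = {f for f in unmatched if not visible(f)}
    return added, validated, removed, occluded
```

Every world face ends up in exactly one of V, R, or O, which mirrors the binary existence of faces assumed in the classification.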


3.5 Handling Changes 


We now know for each face whether it should exist in the updated world
representation. Our goal is to obtain a valid representation of the whole scene after each
step. In our domain with complete objects, it is impossible for faces to exist on their own, as
they always originate from an object. If a single face of an object is missing, the complete
object should be removed.

The set $V$ of validated faces can be handled directly by B-Rep merging [15]. The same is
possible with the added faces $A$. As no decision can be made regarding the occluded faces
$O$, we decide that they remain within the world representation. If they have been removed,
this will still be captured later in the active vision process.

This leaves the set $R$ of faces to remove. As we know the object instances
$H = \{h_1, \ldots, h_o\}$ (with every $h_i$ containing at least a B-Rep model representing the
object), we can handle the detected changes in two separate ways: If the face that
should be removed corresponds to an existing object hypothesis, we remove the complete
hypothesis. This is done by determining and deleting all faces in the world representation
which correspond to the hypothesis. First, we determine the set of faces to delete,
$D_{H_0} = \bigcup_{\{h_i \in H \mid \exists r \in R : r \in f(h_i)\}} f(h_i)$. These are all the faces of hypotheses that correspond to
a face to remove. By $f(h_i)$ we obtain all faces from the B-Rep $W$ corresponding to the
B-Rep model of hypothesis $h_i$. Furthermore, we must remove all faces directly connected
to the hypothesis to ensure a valid world representation (this originates from B-Reps as
the underlying data structure). Therefore, the final faces to delete can be determined by
$D_{H_1} = D_{H_0} \cup \{ f \in F_W \mid \exists g \in D_{H_0} : \mathrm{neighbor}(f, g) \}$, where $\mathrm{neighbor}$ denotes
whether two faces $f$ and $g$ are neighbors with respect to their half-edges.

If no hypothesis is available for a face, we remove all neighboring faces (meaning they
share an edge). This process is repeated transitively, but the working surface is removed
beforehand and therefore the procedure stops there. First, we determine the set of faces
without a correspondence as $D_{\bar{H}_0} = \{ r \in R \mid \nexists h_i \in H : r \in f(h_i) \} = R \setminus D_{H_0}$. Now, we
add the neighboring faces by $D_{\bar{H}_{i+1}} = D_{\bar{H}_i} \cup \{ f \in F_W \mid \exists g \in D_{\bar{H}_i} : \mathrm{neighbor}(f, g) \}$.
On the one hand, we have to delete multiple faces, as we do not know which object may
correspond to these faces. On the other hand, we have to ensure the validity of the B-Rep.
As a consequence, too many faces may be removed. However, the faces remaining
within the scene can be examined later using the underlying active vision approach.
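
Both ways of handling removed faces can be sketched in a few lines. The sketch below is an illustration under assumptions, not the authors' implementation: faces are plain IDs, `hypotheses` maps a hypothesis name to the world faces it explains, `neighbors` is a precomputed adjacency map derived from the half-edges, and excluding the working surface from the result is a simplification chosen here to model that the table is taken out beforehand.

```python
def faces_to_delete(removed, hypotheses, neighbors, working_surface):
    """Return the world-model faces to delete for the removed faces.

    removed: set of face IDs classified as removed (R).
    hypotheses: dict mapping a hypothesis name to the set of world faces it explains.
    neighbors: dict mapping a face ID to the set of faces sharing a half-edge with it.
    working_surface: face IDs of the table; the expansion stops there (assumption).
    """
    # Case 1: a removed face belongs to a hypothesis, so delete the whole hypothesis ...
    d_h0 = set()
    for faces in hypotheses.values():
        if faces & removed:
            d_h0 |= faces
    # ... plus all faces directly connected to it.
    d_h1 = d_h0 | {n for f in d_h0 for n in neighbors.get(f, set())}

    # Case 2: removed faces without a hypothesis are expanded transitively over neighbors.
    frontier = set(removed) - d_h0
    d_no_hyp = set()
    while frontier:
        f = frontier.pop()
        if f in d_no_hyp or f in working_surface:
            continue
        d_no_hyp.add(f)
        frontier |= neighbors.get(f, set()) - d_no_hyp
    return (d_h1 | d_no_hyp) - set(working_surface)
```

The transitive expansion is what can delete more faces than strictly necessary; the active vision process is then expected to re-observe whatever was removed by mistake.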


4 Evaluation


4.1 Setup 


The evaluation is split into two parts. On the one hand, we validate our classification of the
different types of change. To do this, for each change type a scene is recorded which contains
exactly one change. Furthermore, we validate the usefulness regarding scene-unspecific
signals, e.g. a recorded human. On the other hand, we evaluate our approach by comparing
the reconstruction of a dynamic scene with that of a static one. To achieve this, we build a
scene and record it with our active vision approach and the handling of dynamic scenes enabled.
When the reconstruction is completed, we use only the active vision approach on the now static
scene to obtain a ground truth. To compare both reconstructions, we use three criteria: First,
we count the number of faces. Second, we remove all faces which are present in both
reconstructions. To determine to which faces this applies, we use the definition of $\eta$, which
determines whether two faces correspond to each other. If the number of unexplained faces is
low, the two reconstructed scenes are similar. Finally, we delete all faces which are explained by
manually validated hypotheses. This is necessary because some faces may be impossible
to view in the static reconstruction due to occlusion. Additionally, our goal is a correct
recognition of all objects, and not a complete reconstruction of the scene. If any faces
remain afterwards, an error occurred during the reconstruction or the handling of
dynamic scenes and should be investigated further.
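
The three comparison criteria can be expressed compactly. The following is a hedged sketch: `eta` is the face correspondence test, `explained_by_hypothesis` is a hypothetical predicate standing in for the manual validation, and the function name is chosen for this example only.

```python
def unexplained_faces(dynamic_faces, static_faces, eta, explained_by_hypothesis):
    """Apply the three comparison criteria and return the leftover faces.

    eta(f, g): correspondence test between two faces.
    explained_by_hypothesis(f): True if f belongs to a manually validated hypothesis.
    """
    # Criterion 1: number of faces in each reconstruction.
    counts = (len(dynamic_faces), len(static_faces))
    # Criterion 2: drop faces that have a correspondence in the other reconstruction.
    dyn_left = [f for f in dynamic_faces
                if not any(eta(f, g) for g in static_faces)]
    sta_left = [g for g in static_faces
                if not any(eta(g, f) for f in dynamic_faces)]
    # Criterion 3: drop faces explained by manually validated hypotheses.
    dyn_left = [f for f in dyn_left if not explained_by_hypothesis(f)]
    sta_left = [g for g in sta_left if not explained_by_hypothesis(g)]
    return counts, dyn_left, sta_left
```

Any face that survives all three steps points to an error in the reconstruction or in the handling of the dynamic scene.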

As a hardware setup we use a KUKA LWR 4 robot with a hand/eye calibrated ENSENSO
N10 depth camera. To ensure high-quality point clouds, we average over multiple recordings
from one view to reduce the impact of noise. We utilize an object database with 25 instances
[3], which consists of objects from different domains and complexity levels considering the
number of faces, symmetry, and convexity.


4.2 Results 


In the validation, we start with the removal of scene-unspecific signals, as seen in Fig. 1.
In a first scene, a human arm is reconstructed by multiple planar segments. With the arm
removed and another B-Rep incorporated into the scene, the segments are identified as
removed and deleted from the reconstruction. Only one patch remains, since it is too small.
Furthermore, more objects are classified correctly, as the arm occluded some of them.
Regarding the validation of every possible change case, the removed case can be seen
in Fig. 2. From an initial pose, two objects are reconstructed and identified correctly. The
robot manipulator moves to a new pose to validate the hypotheses. One of the objects is
removed in between, and the corresponding faces are deleted in the resulting reconstruction.
Additionally, the resulting gap in the table is closed. The next case is the occluded one, as
shown in Fig. 3. First, one object is visible. In front of it, two more objects are placed, which
occlude the first one. Furthermore, the now occluded object is removed. In the resulting
representation, the new objects are added and the previous one is not deleted, due to the
occlusion. As we cannot be sure whether this object is still there or not, it should remain
inside the representation. Finally, another image is taken from a different view (from which
the first object should be visible) and the object is removed from the representation, as we
can be sure that it is not there anymore. The remaining two cases, added and validated, were
evaluated as well but are not shown with figures here because the handling is done with
existing and already evaluated methods.

Fig. 1 The reconstruction of a scene with an irrelevant signal (left, surrounded in green) and the
resulting reconstruction after another image from the same pose was incorporated (right). The B-Rep
is drawn in gray; the hypotheses are the red wire frame models. The arrows indicate possible next
views for the active vision process. The coordinate system indicates the base of the robot

Fig. 2 Removing an object from the initial reconstruction (left) and the resulting representation
(right). Camera frustum projection on the table is drawn in black

Fig. 3 Removing and adding multiple objects in one step. An additional view is necessary to delete
the removed object due to occlusion

For our evaluation, we used five scenes with different objects and overall complexity.
Each scene was modified by multiple changes (moving, removing, and adding multiple
objects and generating unwanted signals). One scene is shown in Fig. 4. Accumulated over
all scenes, we gathered 126 faces for the dynamic case and 120 for the static one. 97 faces
from the dynamic reconstruction had a match in the static one, and 98 faces the other way
around, meaning one face was explained by two others. This occurs, e.g., when the complete
face was not captured by the sensor and two patches were reconstructed instead of one
complete face. Furthermore, 29 faces had no correspondence in the static reconstruction (28
in the other case). However, only 1 face was left after deleting all faces with hypothesis
correspondences. The high number of faces without a correspondence originates primarily
from occlusion during the static reconstruction. Furthermore, a few small faces are impossible
to look at directly using active vision, due to collision prevention mechanisms (e.g. if the face
is close to the working area). Therefore, in some reconstructions a face may be present
because it was captured together with a neighboring face (which may not be the case in another
reconstruction). The remaining 1 face occurs because the properties of an old and
a new face are too similar. An object was removed from the scene, and another object was
placed there instead. One face of the new object has the same properties, here face area and
normal, regarding the function $\eta$. Therefore, the old face was not deleted. Depending on the
face, it may be removed later if the camera takes a direct look at it and the algorithm
is able to differentiate between the old and the new one.

Based on these results, we can conclude the usefulness of our approach: On the one hand,
we can successfully tackle the problem of unwanted signals. If any scene-unrelated part is
captured, it is investigated further by the active vision method and therefore deleted as soon
as the disrupting object is removed. On the other hand, we can handle changes that occur
during the robot movement, as seen in our validation and evaluation.

Fig. 4 One example scene of the evaluation. On the left, the final world representation and pose of
the dynamic evaluation are visible, and on the right those of the static one


5 Conclusion 


We present a novel approach to detect and handle dynamic scenes for a special type of 
representation. Our approach uses a categorization of all possible types of change for B- 
Reps. To detect these changes we compare the world model thus far and the current view. 
To determine the type of change we project both the world representation and the current 
view onto a 2D plane and compare what we see with what we should see. Afterwards, the
detected change is handled, either by utilizing existing object hypotheses or only the geometric
information from the scene. The evaluation demonstrates the usefulness of our approach for
using B-Reps in object recognition.

Future work may include the usage of a time-based component to delete world model 
entries which were not validated within a certain period. Furthermore, different techniques 
of how much of the scene should be deleted if a face is missing can be implemented and 
evaluated. Finally, other representations than B-Reps are of interest, which are easier to keep 
valid when deleting faces. 


Acknowledgements This work has partly been supported by the Deutsche Forschungsgemeinschaft 
(DFG) under grant agreement He2696/21 SeLaVi. 


References 


1. Cencen, A., Verlinden, J.C., Geraedts, J.M.P.: Design methodology to improve human-robot
coproduction in small- and medium-sized enterprises. IEEE/ASME Trans. Mechatron. 23(3)
(2018)

2. Chen, S., Li, Y., Kwok, N.: Active vision in robotic systems: a survey of recent developments. 
Int. J. Robot. Res. 30(11) (2011) 

3. Rohner, D., Henrich, D.: Object recognition for robotics based on planar reconstructed B-rep 
models. Third IEEE International Conference on Robotic Computing (IRC) (2019) 

4. Rohner, D., Henrich, D.: Using active vision for enhancing an surface-based object recognition 
approach. Fourth IEEE International Conference on Robotic Computing (IRC) (2020) 

5. Radke, R. J., et al.: Image change detection algorithms: a systematic survey. IEEE Trans. Image 
Process. 14(3) (2005) 

6. Postica, G., Romanoni, A., Matteucci, M.: Robust moving objects detection in Lidar data exploit- 
ing visual cues. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 
(2016) 

7. Litomisky, K., Bhanu, B.: Removing moving objects from point cloud scenes. International
Workshop on Depth Image Analysis and Applications (2012)




8. Newcombe, R., Fox, D., Seitz, S.: DynamicFusion: reconstruction and tracking of non-rigid 
scenes in real-time. IEEE Conf. Comput. Vis. Pattern Recognit. (2015) 
9. Mei, J., et al.: Semantic segmentation of 3D LiDAR data in dynamic scene using semi-supervised
learning. IEEE Trans. Intell. Transp. Syst. 21(6) (2019)
10. Izadi, S., et al.: KinectFusion: real-time 3D reconstruction and interaction using a moving depth 
camera. 24th Annual ACM Symposium on User Interface Software and Technology (2011) 
11. Riedelbauch, D., Werner, T., Henrich, D.: Supporting a human-aware world model through sensor 
fusion. 26th International Conference on Robotics in Alpe-Adria-Danube Region (2017) 
12. Sengar, S.S., Mukhopadhyay, S.: Moving object detection using statistical background subtrac- 
tion in wavelet compressed domain. Multimed. Tools. Appl. 79(9) (2020) 
13. Fan, L., et al.: A survey on multiple object tracking algorithm. IEEE International Conference 
on Information and Automation (ICIA) (2017) 
14. Kragic, D., Christensen, H.I.: Survey on visual servoing for manipulation. Technical report. 
Computational Vision and Active Perception Laboratory, Stockholm (2002) 
15. Sand, M., Henrich, D.: Incremental reconstruction of planar B-Rep models from multiple point 
clouds. Vis. Comput. 32(6) (2016) 
16. Sand, M., Henrich, D.: Matching and pose estimation of noisy, partial and planar B-rep models. 
Computer Graphics International Conference (2017) 


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter’s Crea- 
tive Commons license, unless indicated otherwise in a credit line to the material. If material is 
not included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 



An Integrated Approach for Hand Motion 
Segmentation and Robot Skills Representation 


Shuang Lu, Julia Berger and Johannes Schilp


Abstract 


In this work, an approach for robot skill learning from voice commands and hand movement
sequences is proposed. The motion is recorded by a 3D camera. The proposed
framework consists of three elements. Firstly, a hand detector is applied to each frame
to extract key points, which are represented by 21 landmarks. The trajectories of the index
fingertip are then taken as hand motion for further processing. Secondly, the trajectories
are divided into five segments by voice commands and finger moving velocities. These five
segments are reach, grasp, move, position and release, which are considered as skills in
this work. The required voice commands are grasp and release, as they have short duration
and can be viewed as discrete events. Finally, dynamic movement primitives are
learned to represent reach, move and position. To demonstrate the approach,
a human demonstration of a pick-and-place task is recorded and evaluated.


1 Introduction


The demand for customized products has been increasing rapidly in the last decades. The
manufacturing process should be adjustable to individual requests. Collaborative robots
can work hand-in-hand with human workers on assembly tasks, which can improve the
flexibility in task execution. However, the application of hybrid systems is still in its
infancy. One obstacle is the complex robot programming process. Another one is the
expertise required from the worker for each specific type of robot. Moreover, the tasks have
to be re-programmed each time a new request is received from the factory. This is
time-consuming and raises production costs.

Learning from demonstration is a promising programming paradigm for non-experts.
Kinesthetic teaching has been widely explored in the last decades for data collection [1].
However, the process can be tedious for a human worker, especially for multi-step tasks.
Instead of guiding the robot directly by hand, visual observation has recently gained more
attention, thanks to developments in the field of computer vision. Hand movement can be
tracked and recorded by optical sensors. The trajectories from a demonstration are then
segmented into elementary action sequences such as pick-and-place of objects, which are also
known as skills. A task model is then defined as a sequence of skills [2]. The basic motions
(reach, grasp, move, position and release) in methods-time measurement (MTM) [3] are
considered as skills in this work, such that a learned task model can be optimized in a more
flexible way during execution. For instance, the move motion can be optimized while reach and
grasp remain unchanged. The representation is also beneficial for integrating natural language
in the form of voice commands such as grasp and release, since they can be considered as
discrete events both for human speaking and for robot execution. The aim of this work is to
develop a framework in which the robot is able to learn a task from integrated natural language
instruction and video demonstration. The main contributions of this work are:


• Proposal of a pipeline to extract hand motion from 3D video sequences.

• Proposal of integrating voice commands with velocity-based motion segmentation.

• Definition of skills according to methods-time measurement (MTM): extraction of discrete
skills from voice commands and extraction of continuous skills from visual observation.


2 Related Work 


This section provides a summary of recent literature on robot learning from visual observation.
Ding et al. developed a learning strategy for assembly tasks, in which the continuous
human hand movements are tracked by a 3D camera [4]. Finn et al. presented a visual imitation
learning method that enables a robot to learn new skills from raw pixel input [5].
It allows the robot to acquire task knowledge from a single demonstration. However, the
training process is time-consuming and the learned model is sensitive to environment changes.
Qiu et al. presented a system which observes human demonstrations with a camera [6]. A
human worker demonstrates an object handling task while wearing a glove. The hand pose is
estimated based on a deep learning model trained on 3D input data [7]. The human demonstration
is segmented by Hidden Markov Models (HMM) into motion primitives, so-called
skills. The skills are represented by Dynamic Movement Primitives (DMPs), which allow
generalization to new goal positions. However, there are no rules for defining the semantics of
skills in the existing works. Pick up, place and locate are considered as skills by Qiu et al.
[6], whereas Kyrarin et al. define them as start arm moving, object grasp and object release
[8]. This makes it difficult to compare the performance of different approaches. Shao et
al. developed a framework which allows a robot to learn manipulation concepts from human
visual demonstrations and natural language instructions [9]. By manipulation concepts they
mean, for instance, “put [something] behind/into/in front of [something]”. The model’s inputs
are a natural language instruction and an RGB image of the initial scene. The outputs are the
parameters of a motion trajectory to accomplish the task in the given environment. Task
policies are trained by an integrated reinforcement and supervised learning algorithm. Instead
of classifying all possible actions in a video demonstration, the focus of this work is to extract
motion trajectories from each video.


3 Motion Segmentation 


In this section, the methods for extracting hand motion trajectories and segmentation are 
described. 


3.1 Data Collection 


This work aims to extract human motion from video sequences, which consist of both color
and depth information of the hand motion. Given recorded motion data, a pipeline consisting
of the following three steps is proposed. Firstly, objects that are more than one meter
away from the camera origin are removed. Secondly, the depth frame is aligned to the color
frame, which is necessary because the depth and color streams have different viewpoints. The
resulting frames have the same shape as the color image. Thirdly, the hands are detected by
the MediaPipe framework [10] in the recorded color image sequences. The output of the hand
detector is a set of 21 3D hand-knuckle coordinates inside the detected hand regions.
Figure 1(a) shows an example. Each landmark is represented by x-, y- and z-coordinates: x and
y are normalized to [0.0, 1.0] by the image width and height, respectively, and z represents
the landmark depth with the wrist as the origin. An illustration of the landmarks on the hand
can be found on the MediaPipe website (footnote 1).


1 https://google.github.io/mediapipe/solutions/hands.html 




Fig. 1 (a) Detected hand key points on color image; (b) 21 hand joint positions with wrist as origin


If no hand is detected, the time stamp is excluded from the output time sequences.
Otherwise, the key points in pixel coordinates are transformed to the camera coordinate system
shown in Fig. 3(a). A detailed illustration of the hand landmarks in the world coordinate system
with the wrist as the origin is given in Fig. 1(b). A flowchart of the proposed pipeline is shown
in Fig. 2.
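
The transformation from a normalized landmark to camera coordinates can be sketched with a pinhole camera model. This is an illustration under assumptions rather than the pipeline's actual code: the intrinsics `fx`, `fy`, `cx`, `cy` are assumed to be known from the camera calibration, lens distortion is ignored, a depth value of zero is assumed to mark a missing measurement, and the function name is hypothetical. The one-meter limit mirrors the background removal step described above.

```python
def landmark_to_camera(x_norm, y_norm, depth_m, fx, fy, cx, cy, max_range_m=1.0):
    """Back-project one normalized landmark into the camera frame.

    x_norm, y_norm: landmark position normalized to [0, 1] by image width and height.
    depth_m: depth image (list of rows, in meters), already aligned to the color image.
    fx, fy, cx, cy: pinhole intrinsics of the color camera (assumed to be known).
    Returns (X, Y, Z) in meters, or None if no usable depth is available.
    """
    height, width = len(depth_m), len(depth_m[0])
    # Normalized coordinates -> pixel indices, clamped to the image borders.
    u = min(max(int(x_norm * width), 0), width - 1)
    v = min(max(int(y_norm * height), 0), height - 1)
    z = depth_m[v][u]
    # Skip missing measurements and everything beyond the working range.
    if z <= 0 or z > max_range_m:
        return None
    # Pinhole back-projection.
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```

Returning `None` for unusable samples matches the idea of excluding a time stamp instead of emitting an unreliable 3D point.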


3.2 Motion Representation 


The goal of motion segmentation is to split the recorded time series into five basic
motions: reach, grasp, move, position and release [11]. Together they form the motion cycle of
pick-and-place in multi-step manipulation tasks. The trajectories can be represented by
$P(x, y, z) = [p(t_0), p(t_1), \ldots, p(t_i), \ldots, p(t_n)]$, where $t_i$ represents the temporal
information. The segmentation task is to define the starting and ending time of each motion.
Grasp and release are two basic motions with short duration. By recognizing voice commands
from the human, the time stamp of the voice input can be mapped to the hand motion
trajectories. The move and position motions are segmented by the hand moving speed. This is
based on the assumption that the speed of the position motion decreases monotonically. The
results of the segmentation are outlined in Table 1.
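
One possible reading of this segmentation rule is sketched below. It is an interpretation, not the authors' algorithm: the voice time stamps are mapped to the last sample at or before them, and the position motion is taken to be the longest final stretch before the release in which the speed does not increase. The function name and the return convention `(r, m, p)` are assumptions of this example.

```python
import math


def segment_demo(times, points, t_grasp, t_release):
    """Split a fingertip trajectory into reach / move / position index ranges.

    times: increasing time stamps; points: matching (x, y, z) tuples.
    t_grasp, t_release: time stamps of the recognized voice commands.
    Returns (r, m, p): reach = samples 0..r, grasp at r, move = r+1..m,
    position = m+1..p, release at p.
    """
    # Map each voice command to the last sample at or before its time stamp.
    r = max(i for i, t in enumerate(times) if t <= t_grasp)
    p = max(i for i, t in enumerate(times) if t <= t_release)
    # speed[i] is the average speed on the step from sample i to sample i + 1.
    speed = [math.dist(points[i], points[i + 1]) / (times[i + 1] - times[i])
             for i in range(len(times) - 1)]
    # Walk backwards from the release while the speed keeps decreasing over time.
    m = p - 1
    while m > r + 1 and speed[m - 1] >= speed[m]:
        m -= 1
    return r, m, p
```

Real recordings are noisy, so the speed would typically be smoothed before applying a strict monotonicity test like this one.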


4 DMP for Skills Representation 


A Dynamic Movement Primitive (DMP) is a way to learn motor actions, which are formalized
as stable nonlinear attractor systems [12]. There are many variations of DMPs. As
summarized by Fabisch [13], they have in common that


• they have an internal time variable (phase), which is defined by a so-called canonical
system,

• they can be adapted by tuning the weights of a forcing term and

• a transformation system generates goal-directed accelerations.


Fig. 2 Flowchart of the process for generating hand motion trajectories. Input: recorded color and
depth image sequences → remove the background (objects more than one meter away) → align the
depth frame to the color frame → hand detection on the color frame → if no hand is detected, exclude
the time stamp; otherwise transform the key points from pixel coordinates to camera coordinates →
depth estimation of the detected hand key points → output: time sequences of the 3D coordinates of
21 hand landmarks in the camera frame


Table 1 Representation of skills

Skill       Representation
Reach       $[p(t_0), p(t_1), \ldots, p(t_r)]$
Grasp       $G(t_r)$
Move        $[p(t_{r+1}), p(t_{r+2}), \ldots, p(t_m)]$
Position    $[p(t_{m+1}), p(t_{m+2}), \ldots, p(t_p)]$
Release     $R(t_p)$


The canonical system uses the phase variable $z$, which replaces explicit timing in DMPs.
Its values are generated by

$$\tau \dot{z} = -\alpha z, \tag{1}$$

where $z$ starts from 1 and approaches 0, $\tau$ is the duration of the movement primitive,
and $\alpha$ is a constant that has to be set such that $z$ approaches 0 sufficiently fast.
The transformation system is a spring-damper system that generates a goal-directed motion,
which is controlled by the phase variable $z$ and modified by a forcing term $f$:

$$\tau \dot{v} = K(g - y) - D v - K(g - y_0) z + K f(z), \qquad \tau \dot{y} = v. \tag{2}$$


The variables $y$, $\dot{y}$, $\ddot{y}$ are interpreted as the desired position, velocity
and acceleration for a control system, $y_0$ is the start and $g$ is the goal of the
movement. The forcing term $f$ can be chosen as

$$f(z) = \frac{\sum_i \psi_i(z)\, w_i}{\sum_i \psi_i(z)}\, z \tag{3}$$

with parameters $w_i$ that control the shape of the trajectory. The influence of the forcing
term decays as the phase variable approaches 0. $\psi_i(z) = \exp(-d_i (z - c_i)^2)$ are
radial basis functions with constants $d_i$ (widths) and $c_i$ (centers). The DMP
formulation presented in [14] is considered in this work, such that the desired velocity can
be incorporated.
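
To make the interplay of the canonical system, the transformation system and the forcing term concrete, the following sketch integrates Eqs. (1) to (3) with explicit Euler steps for a single degree of freedom. It is a minimal illustration in plain NumPy and not the implementation used in this work; the extension of [14] for final velocities is not included. The function names and the values chosen for `alpha`, `K`, `D` and `dt` are assumptions.

```python
import numpy as np


def forcing_term(z, w, c, d):
    """Eq. (3): normalised radial basis mixture, scaled by the phase z."""
    psi = np.exp(-d * (z - c) ** 2)
    return z * np.dot(psi, w) / np.sum(psi)


def rollout(y0, g, w, c, d, tau=1.0, alpha=4.6, K=100.0, D=20.0, dt=0.001):
    """Integrate Eqs. (1) and (2) over the duration tau; returns positions y."""
    n_steps = int(round(tau / dt))
    z, y, v = 1.0, float(y0), 0.0
    ys = np.empty(n_steps + 1)
    ys[0] = y
    for k in range(n_steps):
        f = forcing_term(z, w, c, d)
        # Right-hand sides of Eq. (2) and Eq. (1), each divided by tau.
        v_dot = (K * (g - y) - D * v - K * (g - y0) * z + K * f) / tau
        y_dot = v / tau
        z_dot = -alpha * z / tau
        # Explicit Euler update of all three states.
        v += v_dot * dt
        y += y_dot * dt
        z += z_dot * dt
        ys[k + 1] = y
    return ys
```

With all weights set to zero the forcing term vanishes and the rollout reduces to a smooth point-to-point motion towards `g`; changing `g` then only rescales the trajectory, which is the goal adaptation property described above. Learning the weights from a demonstration is not shown here.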


5 Experimental Setup 


The setup is illustrated in Fig. 3(a). The Intel® RealSense™ L515 3D camera is mounted
on the robot to record hand movements. The working space of the robot can be captured by
the camera, as shown in Fig. 3(b). As a LiDAR camera, it projects an infrared laser at a
wavelength of 860 nm as an active light source. 3D data are obtained by evaluating the time
required for the projected signal to bounce off the objects in the scene and return to the
camera [15]. The size of the color images recorded by the L515 is 1280 × 720 and the size
of the depth images is 640 × 480. The Intel RealSense Viewer is used to record the video
sequences. The natural language text for segmentation is manually inserted into the
recorded sequences.


6 Experimental Results 


To validate the proposed approach, a task demonstration is recorded with the setup in Fig. 3, 
in which the human demonstrated a pick-and-place task. The methods proposed in Sects. 3 
and 4 are applied on the recorded sequence. The results are discussed in the following. 


6.1 Segmentation 


The index fingertip trajectory in X is outlined in Fig. 4. It shows that some data are
missing due to depth errors in the data collection process. The segmentation result based
on the voice commands grasp and release can also be found in Fig. 4. In the next step, the
sequence between grasp and release is segmented into move and position, as illustrated in
Fig. 5. This is achieved by defining the temporal information of the voice command input.


Fig. 4 Hand motion segmentation based on voice command (axes: time (s), position (m))

Fig. 5 Segmentation into move and position

Fig. 6 Learned DMP for representing reach


6.2 Learning DMPs 


Before learning the DMPs, linear interpolation is applied to both the time and the
trajectory data. Additionally, the time series are shifted such that every trajectory
starts at time zero. Three DMPs are learned to represent the trajectories in X, Y and Z.
The implementation by Fabisch [13] (https://github.com/dfki-ric/movement_primitive) is used
to learn the DMPs. For the sake of simplicity, only X and Y are shown in Fig. 6 and Fig. 7.
The learned model can be adapted to new goal positions such as 0, 1 or 2.
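
The preprocessing step described above (shifting the time axis to zero and interpolating linearly) can be sketched as follows. Dropping samples with missing depth, represented here as NaN, before interpolating is an assumption about how the gaps are handled; the function name `resample_trajectory` and the default of 100 samples are likewise hypothetical.

```python
import numpy as np


def resample_trajectory(t, x, n_samples=100):
    """Shift time to start at zero and resample one coordinate linearly.

    t: (N,) time stamps, x: (N,) coordinate values, NaN where depth is missing.
    Returns a uniform time grid and the interpolated values on that grid.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    valid = ~np.isnan(x)           # discard samples without a depth value
    t, x = t[valid], x[valid]
    t = t - t[0]                   # every trajectory starts at time zero
    t_uniform = np.linspace(0.0, t[-1], n_samples)
    return t_uniform, np.interp(t_uniform, t, x)
```

The same function would be called once per coordinate (X, Y and Z) before the corresponding DMP is learned.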
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Fig.8 Learned DMP for representing move 


To represent the move motion, it is essential that the DMPs can be adapted to different final
velocities. The results for trajectories and velocities in Fig. 8 show that the learned DMPs
can be adapted to different final velocities such as 0, 1 and 2, where $\dot{x}_g$ and
$\dot{y}_g$ represent the goal velocity. Furthermore, they show that the recorded hand
trajectories are not smooth and cannot be applied to the robot directly. The smoothness can
be improved by the learned DMPs.
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7 Conclusion 


To reduce the complexity of robot programming, an integrated approach is introduced in
this work for robot learning of skills from voice commands and a single video
demonstration. The index finger trajectories extracted from the video are first segmented
into five basic motions: reach, grasp, move, position and release. Grasp and release are
identified by voice input during video recording, after which move and position are
segmented by hand velocity. DMPs are then learned to represent reach, move and position.
They are adaptable to new goal positions and velocities. The experimental results show the
feasibility of the proposed approach. In future work, the missing-data problem caused by
depth errors should be addressed. Furthermore, the learned DMPs should be evaluated on a
real robot.
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Abstract 


In lithium-ion battery (LIB) production, limp electrodes are handled gently by vacuum- 
pressure based handling and transport systems, which generate a fluid flow that prop- 
agates through the porous electrode coating during handling. To investigate the limits 
and material-damaging behavior of vacuum pressure-based handling, it is required to 
understand how process parameters and electrode qualities affect fluid flow. Questions 
on how fluid flow reduces electrode quality are insufficiently addressed or modeled. 
Modeling the electrode and handling system interaction requires knowledge of the effec- 
tive surface geometry and the volumetric flow rate caused by the pressure difference. In 
this article, flow through porous electrode coatings during handling is modeled. Experi- 
ments demonstrate a flow behavior according to the generalized Darcy’s law. Thus, using 
Darcy’s law, modeling fluid flow through the electrode improves the exploration of the 
limits and design of vacuum pressure-based handling and transport of electrodes in LIB 
production. 
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1 Introduction 


It is cost-intensive to optimize the production of lithium-ion batteries (LIB) through time- 
consuming experiments. To increase development efficiency, modeling methods are used 
that provide adequate results for a given quality. This article further increases development 
efficiency by simplified modeling. 

LIBs are composed of electrochemical cells consisting of cathodes and anodes, both called 
electrodes, with or without separators and any electrolyte. The cell assembly produces the 
electrode-separator compound (ESC), a stack, a Z-fold, or a roll. Cell production includes 
handling and transport of the electrodes between their manufacturing and sealing of the ESC. 
Electrode and LIB quality is affected by handling and transport parameters. Vacuum suction- 
based adhesion is a gentle way to handle and transport electrodes fast. The background of this 
article is to model the mechanical stress and damage that occur during handling and transport 
in the electrode. This article aims to model the interrelationships between the electrode active 
material (AM) properties, i.e., porosity, and vacuum suction-based adhesion parameters, i.e., 
pressure and volumetric flow rate during contact, to enable local stress and damage modeling 
of the AM. 

The following describes the structure of the article. This section begins with a brief 
overview of the field of study. Section 2 provides fundamentals of suction-based handling and 
transport in cell assembly. It summarizes the modeling of electrode handling and transport. 
It motivates a flow model for suction-based adhesion of electrodes. Section 3 illustrates the 
physical problem and presents modeling based on Darcy’s law for vacuum suction cups 
and effective vacuum surfaces. In Sect. 4, several electrodes are experimentally examined
to evaluate the model. Finally, Sect. 5 concludes how the model enhances the design of
handling and transport solutions for LIB assembly.


2 Scope and Motivation of Fluid-Dynamic Electrode Model 


Electrodes must be gripped securely, fixed reliably, and not damaged during handling and
transport [1]. Handling and transport can be carried out through adhesion using vacuum
suction within material- and application-specific limits. Compelling examples in industry
and research are introduced in the following.

In widely used modeling methods, adhesion is represented by a uniform pressure distribution
to account for loading, and flow through the electrode is neglected. Modeling methods for
local damage that would benefit from improved flow modeling are considered below.

Section 2.1 introduces applications for vacuum suction-based handling and transport of 
electrodes. Section 2.2 analyses methods for modeling stress during electrode transport. Sec- 
tion 2.3 motivates a modeling approach for adhesion during electrode transport to increase 
development and commissioning efficiency. 
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2.1 Relevance of Vacuum Suction-Based Handling and Transport 
of Electrodes in Battery Production 


In cell assembly, vacuum suction cups and area grippers are means for adhesion during han- 
dling [2]. These grippers adhere the electrode force-locked by a negative pressure difference 
between the inner and the ambient pressure. This gripping principle is popular for handling 
air-impermeable materials. However, electrode surfaces are air-permeable, which this arti- 
cle models and demonstrates by experiment. In the following, applications for electrode 
handling and transport are introduced and illustrated as per Fig. 1. 

Vacuum suction cups are relatively inexpensive and stand out with high accuracy. The 
downsides include an increased risk of AM abrasion, electrode sheet absorption, and marks. 
For vacuum suction cups, the resulting flow through AM is modeled and experimentally 
evaluated hereafter. 

Vacuum effective surfaces have multiple openings at low pressure to distribute the load
resulting from the motion of electrodes. In the following, example applications such as
vacuum area grippers, vacuum deflection rolls, vacuum draw-off rolls, and vacuum conveyor
belts are described.

Vacuum area grippers have a high lateral force absorption and a high deposition accuracy. 
They reduce the risk of damage by distributing the gripping force across the entire electrode 
[2]. The model presented in the article covers the flow and pressure resulting from the 
multiple openings of these grippers. 

Vacuum draw-off rolls were used to separate electrodes from a pile in the projects Kon- 
tiBAT and HoLiB at the research group of the authors. An effective vacuum area in the roll 
(adjustable with suction insert) provides adhesion to orient and accelerate electrodes [3, 4]. 
This article models the flow resulting from the suction inserts. 

Vacuum deflection rolls implemented in the KontiBAT project adhere, deflect and guide 
the electrodes while maintaining a constant material velocity. Vacuum-deflection rolls have 
also been used to ensure constant process velocity of continuous web-based electrodes [5]. 
The model developed in this article is applicable to the flow caused by the adhesion area
of vacuum deflection rolls.
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Fig.1 Vacuum suction-based handling and transport systems for LIB electrodes 
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Vacuum conveyor belts have been used to transport electrodes [6]. During transport, a 
lower pressure under the perforated belt ensures the fixation of the electrodes. The model 
developed in this article applies to vacuum conveyor belts. 

In summary, vacuum suction is used in various applications to adhere and guide electrodes 
in space. The following discusses how the interaction of the applications and electrodes is 
modeled. 


2.2 Earlier Work in Modeling Handling and Transport of Electrodes 


In addition to modeling the kinematics of the electrodes, there are approaches capable of 
modeling effects on the electrode material to derive measures for the design of handling 
processes. Some approaches are subsequently evaluated. 

Finite element method (FEM) has been used for the characterization of mechanical 
stresses, and their occurrence during handling processes on the electrode surfaces [3, 4]. 
FEM relies on continuous macroscopic models. The external loads on the electrodes are 
modeled as uniform surface stresses to represent, e.g., the suction pressure [3]. FEM cannot 
model the load from volumetric flow through the electrodes, which can be done with the 
model in this article. 

Computational fluid dynamics (CFD)-FEM simulations are used to map the movement 
of the electrode foils in air-filled space and to model the use of different operating materials. 
This is particularly suitable for mapping macroscopic processes and determining the effects 
of forces on the contact points of the electrodes [4]. Modeling the volume flow through the 
electrode material at the contact point has not been done with CFD-FEM to the authors’ 
knowledge. This article models the volumetric flow through the electrode AM. 

Discrete element method (DEM) models have aimed at reproducing mechanical tests such as
nanoindentation and the mechanical behavior due to contact with a handling system [3]. So
far, DEM has not been used to model mechanical stresses on the electrode at the microscopic
level resulting from vacuum suction-based handling and transport. The modeling in this
article enables this.

In summary, none of the existing approaches modeled flow through the electrodes to the 
authors’ knowledge. If the flow through the electrode material and its effects can be better 
modeled, the attempts to parameterize the systems could be reduced. This reduction serves 
the goal of development efficiency and the scope of the article. As a result, the focus of this 
article is on modeling the flow through the electrode material. 


2.3 Discussion 


Research aims to increase electrode dimensions for the application in e-mobility. However, 
this increases the stress at the AM that interacts with the handling and transport system 
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during cell assembly. The industry looks for AM that is easy to handle and has a high 
energy density because the handling parameters play a decisive role in the competitiveness 
of production. In the development of new materials, only a few handling properties (e.g., 
electrode strengths) are taken into account since other interrelationships are missing. 

During ramp-up or changeover of production, the equipment is parameterized to elec- 
trodes, the speed is increased, the resulting quality is validated, and then the production speed 
is further increased. In addition, the time the equipment runs continuously is increased to 
identify long-term effects on the materials. These processes are time-consuming and ineffi- 
cient. 

Knowledge of the interrelationship between the handling parameters and the effects on 
electrode quality, or the ability to estimate them, would save many resources during develop- 
ment and commissioning. Up to now, there is a lack of investigation of the interrelationship 
between electrode properties, such as porosity, and process parameters, such as pressure 
difference and volumetric flow rate of typical vacuum suction-based handling and transport 
operations in LIB production. This article is a first step toward modeling the effects of
macroscopic handling parameters on the quality of the electrode microstructure. The
approach begins by modeling the adhesion at the macroscale and derives flow properties from
it, in order to model the effects on the morphology of the electrodes at the microscale.


3 A Model for Vacuum Suction Flow Through Electrodes 


A model for vacuum suction flow through electrodes based on the generalized form of 
Darcy’s law is presented in this chapter. The prerequisites and principles for the flow descrip- 
tion in porous media are described in the following subsections. Section 3.1 models the flow 
through electrodes for a vacuum suction cup. Section 3.2 models the flow through electrodes 
for effective vacuum surfaces. 


3.1 A Flow Model for Vacuum Suction Cup Gripper 


One can use generalized Darcy’s law to model the flow, i.e., the pressure and velocity
distribution through the electrode from a vacuum suction cup gripper. Darcy’s law relates the
volumetric flow rate $Q$ to the pressure difference $\Delta p$ via the porosity $\phi$ and
permeability $K$ of porous media [7]. One assumption is made: the flow through the electrode
is assumed to be two-dimensional, with a small channel height $h$ compared to the inner
radius $r_i$, $h \ll r_i$, creating a radial plane flow channel as per Fig. 2.

The radial flow channel is characterized by the boundary conditions for pressure $p_i$, $p_o$
at the vacuum suction cup’s inner and outer radius $r_i$, $r_o$. The integration of
generalized Darcy’s law and identification of the pressure difference $\Delta p = p_o - p_i$
yields the pressure distribution through the electrode, Eq. (1).
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Fig.2 Flow channel (/eft), parameter study of pressure and velocity (middle), determination relation 
for electrode permeability (right) for vacuum suction cups 
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For a given pressure difference, (1) yields a model of the pressure through the electrode, 
as done for three different vacuum suction cups in Fig. 2. It can be seen that the pressure 
increase varies slightly with the geometry of the vacuum suction cups. From Eq. (1) and
generalized Darcy’s law, one obtains the radial velocity

$$u_r(r) = \frac{K}{\phi\,\eta}\,\frac{\Delta p}{r \log(r_o/r_i)}. \tag{2}$$

A parameter study of Eq. (2) for different vacuum suction cups (see Fig. 2) illustrates the
velocity of the flow. The highest flow velocity is at $r_i$, the lowest at $r_o$. Also, one
can see that small vacuum suction cups have a higher average local radial velocity than
bigger ones. To evaluate the model, one can measure the velocity resulting from the pressure
difference. Since the volumetric flow rate $Q$ is easier to measure, it is convenient to
integrate Eq. (2) to obtain a relation between $Q$ and $\Delta p$, Eq. (3).
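
For orientation, the radial model can be evaluated numerically. The sketch below assumes the standard closed-form solution for steady radial Darcy flow between $r_i$ and $r_o$, namely $p(r) = p_i + \Delta p \, \log(r/r_i)/\log(r_o/r_i)$, the velocity of Eq. (2), and a flow rate obtained by multiplying the Darcy flux $\phi\, u_r$ by the cylindrical area $2\pi r h$. These closed forms, the function names and any numerical values passed in are assumptions for illustration and may differ in notation from Eqs. (1) and (3).

```python
import numpy as np


def pressure(r, r_i, r_o, p_i, dp):
    """Assumed pressure profile: p_i at r_i, p_i + dp at r_o, logarithmic in r."""
    return p_i + dp * np.log(r / r_i) / np.log(r_o / r_i)


def radial_velocity(r, r_i, r_o, dp, K, phi, eta):
    """Magnitude of the radial velocity as in Eq. (2); SI units throughout."""
    return K * dp / (phi * eta * r * np.log(r_o / r_i))


def flow_rate(r_i, r_o, dp, K, eta, h):
    """Assumed volumetric flow rate: Darcy flux phi * u_r times area 2*pi*r*h.

    The radius r and the porosity phi cancel, so Q is linear in dp.
    """
    return 2.0 * np.pi * h * K * dp / (eta * np.log(r_o / r_i))
```

Because `flow_rate` is linear in `dp`, the slope of a measured flow rate over pressure difference curve can be mapped back to a permeability, which is the idea behind the evaluation in Sect. 4.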
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3.2 A Flow Model for Vacuum Effective Surfaces 


The flow model through electrodes for effective vacuum surfaces uses potential analysis
of flow. Potential flow models can be applied to viscous flows between closely spaced
plates, which applies to vacuum adhesion of electrodes. Moreover, a constant fluid density
$\rho_f$ is assumed. The electrode plane is understood as a complex plane with
$z = x + iy \in \mathbb{C}$ (as illustrated in Fig. 3). In the plane, the potential is
represented as the real part $\Phi$ of a holomorphic function
$f(z) = \Phi(x, y) + i\Psi(x, y)$, and the imaginary part $\Psi$ describes the resulting
flow direction. The real and imaginary parts of the complex velocity potential satisfy the
Laplace equation in the plane.
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Fig.3 Illustration of flow channel (left), parameter study of pressure and velocity through electrode 
from a vacuum effective surface with 30 circular openings (right) 


From $f$, assuming a homogeneous permeability $K$ of the electrode, one gets a relation for
the pressure distribution of multiple openings $p_n$ ($z_n = x_n + i y_n$) for any vacuum
effective surface with circular suction areas. Each opening, into which a quantity
$Q_{s,n}$ of fluid flows per unit AM thickness per unit time, contributes to the pressure
distribution [7]. From these assumptions one can determine the velocity distribution $u_x$,
$u_y$ in the $x$ and $y$ directions as well as its average value $u$ in the flow direction:

$$u = \frac{1}{\phi \cdot 2\pi} \sum_n Q_{s,n} \frac{1}{|z - z_n|}. \tag{4}$$

If all $Q_{s,n}$ are known, Eq. (4) allows modeling $u$, with porosity $\phi$, through the
electrode AM. From a practical point of view, it is of interest to model the flux values
$Q_{s,n}$ with generalized Darcy’s law and to solve a linear system of equations for known
opening pressures $p_n$. Since $\Delta p$ is a design parameter of the vacuum handling and
transport system, and the separate openings are often connected to the same vacuum-pressure
reservoir, the openings are considered to have the pressure $p_i$.
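
Eq. (4) can be evaluated directly once the opening positions and fluxes are given. The following sketch does this for arbitrary evaluation points in the complex plane; it assumes the fluxes $Q_{s,n}$ are already known and does not solve the linear system for them. The function name `mean_velocity` and its argument names are hypothetical.

```python
import numpy as np


def mean_velocity(z, z_n, q_n, phi):
    """Evaluate u(z) = 1 / (phi * 2 * pi) * sum_n Q_{s,n} / |z - z_n|.

    z:   evaluation point(s) as complex number(s) x + 1j * y (scalar or array)
    z_n: centres of the openings as complex numbers
    q_n: flux into each opening per unit AM thickness per unit time
    phi: porosity of the electrode AM
    The expression is singular at z == z_n, so points on an opening centre
    must be excluded by the caller.
    """
    z = np.asarray(z, dtype=complex)
    z_n = np.asarray(z_n, dtype=complex)
    q_n = np.asarray(q_n, dtype=float)
    # Distance of every evaluation point to every opening (last axis: openings).
    dist = np.abs(z[..., np.newaxis] - z_n)
    return np.sum(q_n / dist, axis=-1) / (phi * 2.0 * np.pi)
```

Passing a grid such as `X + 1j * Y` built with `np.meshgrid` would give a velocity map comparable in spirit to the parameter study in Fig. 3.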

For the pressure model, it is convenient to introduce a Green’s function $G$, which is
defined as a solution of Laplace’s equation that is symmetrical in two points $(x, y)$,
$(x', y')$ (sometimes called mirror charges), possesses a logarithmic singularity when
$(x, y) = (x', y')$, and vanishes when $(x, y)$ is a point on the boundary $\partial S$ of
the region in question [7]. When $G$ is found, the pressure distribution for one circular
vacuum-pressure region on a rectangular region can be calculated as

$$p(x, y) = \frac{1}{2\pi} \oint_{\partial S} p_b \frac{\partial G}{\partial n'} \, ds. \tag{5}$$

In Eq. (5), $p_b$ is the value of $p$ at the region’s boundary. The line integral elements
are denoted by $ds$, $n'$ is the exterior normal to $ds$, and the integral extends over the
whole boundary $\partial S$. Since Laplace’s equation allows superposition of its solutions
due to its linearity, one can solve Eq. (5) for multiple circular openings of an effective
vacuum surface. For that, one creates a linear combination of all pressure distributions
$p_{all} = \sum_j c_j \cdot p_j(x, y)$ and adjusts the constants of each term according to
the inner and ambient pressure at the boundaries. With the resulting pressure distribution
$p_{all}$, one can then derive a formulation for the respective velocity distribution
$u_{all}$ over the surface. One can model the velocity values with a known permeability $K$
of the electrode material.

An example for $p_{all}$ and $u_{all}/K$ has been calculated for 30 circular suction areas,
which is illustrated in Fig. 3 within a boundary of 5 to 6 cm, representing the surface of a
suction insert that could be used in a draw-off roll similar to that in KontiBAT or HoLiB
[3, 4]. The pressure near the 30 suction areas with diameters of 3.5 mm is almost as high as
the assumed inner pressure and increases faster with a shorter distance to the boundary.


4 Evaluation of the Model 


An experiment is conducted to verify that vacuum-based handling of porous electrode
AM follows generalized Darcy’s law. In addition, the measured data yield the permeabilities
of the reference electrodes and are used to tune the presented flow model. Section 4.1
introduces the experimental setup and the measures to reduce the recorded data. Section 4.2
discusses the results of the experiment.


4.1 Experimental Setup and Data Reduction 


Within the scope of the investigation, the influence of the gripping parameters, i.e., the
pressure difference $\Delta p$ and the suction surface geometry $r_i$, $r_o$, on the surface
quality was examined. In the experiment, the proposed models were examined according to
Eq. (3). For this purpose, samples of the reference electrodes were placed on the vacuum
suction cup under different $\Delta p$ while measuring the volumetric flow rate $Q$. For
three different vacuum suction cups, pressure differences $\Delta p$ of ≈ 30 mbar and
≈ 200 mbar were chosen, according to the resolvable range of the sensors described below.

The thermal flow sensor Festo SFAH measures the volumetric flow rate $Q$. The differential
pressure sensor module Beckhoff AEM3712 detects the pressure difference $\Delta p$ between
the pressure of the fluid $p_i$ and the ambient pressure $p_o$.

Four reference anodes and four reference cathodes were cut into six square samples, with
edge length $2 \cdot r_o$ of the vacuum suction cup. Each sample was placed on a vacuum
suction cup at the test rig. A pressure difference was applied, and the volumetric flow rate
was recorded. Afterward, the thicknesses of the samples and current collectors were measured
with a micrometer screw gauge. The measured thicknesses were used to determine the thickness
of the electrode material $h$, necessary for the proposed modeling (Eq. (3)) and sketched in
Fig. 2.


Flow Modeling for Vacuum Pressure-Based Handling ... 313 


At the beginning of the suction, the available volume of the vacuum suction cup’s movable
circular bellows is emptied, as shown by the volumetric flow rate measurements (illustrated
for a cathode sample in Fig. 4). With the identification of the asymptotic volumetric flow
rate ($\dot{Q} \approx 0$), the average volumetric flow through the electrode is determined.
Since $\Delta p$ in the regions of $\dot{Q} \approx 0$ is considered to be almost constant,
the pressure difference $\Delta p(\dot{Q} \approx 0)$ is also averaged, and the standard
deviation is formed.
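
The data reduction described above can be sketched as a simple plateau detection. The threshold on the rate of change, `rate_tol`, and the function name `asymptotic_mean` are assumptions; the text does not state how the region of $\dot{Q} \approx 0$ was identified in practice.

```python
import numpy as np


def asymptotic_mean(t, q, dp, rate_tol):
    """Average Q and dp over the samples where |dQ/dt| <= rate_tol.

    t, q, dp: equally long arrays of time, flow rate and pressure difference.
    Returns (mean Q, mean dp, standard deviation of dp) over the plateau.
    If no sample satisfies the threshold, the results are NaN.
    """
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    dp = np.asarray(dp, dtype=float)
    q_dot = np.gradient(q, t)            # numerical time derivative of Q
    steady = np.abs(q_dot) <= rate_tol   # samples on the asymptotic plateau
    return q[steady].mean(), dp[steady].mean(), dp[steady].std()
```

One such triple per sample and pressure level would then form a single point in the flow rate over pressure difference plot.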

Subsequently, the averaged asymptotic volumetric flow rates are plotted over the averaged
pressure difference as per Fig. 4. A linear least-squares regression is performed for each
vacuum suction cup. The average dynamic viscosity on the measurement day was
$(18.2 \pm 0.1) \cdot 10^{-6}$ Ns/m². The electrode permeability $K$ is calculated from the
gradient of the fits, as per Eq. (3). In that case, the cathode’s permeability is
$K \approx 1773 \pm 230$ mD and the anode’s is $K \approx 2320 \pm 613$ mD, neglecting the
values for the ESS-20 vacuum suction cup.
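
How a permeability follows from the slope of such a regression can be sketched as below. The sketch assumes that Eq. (3) has the radial Darcy form $Q = 2\pi h K \Delta p / (\eta \log(r_o/r_i))$, which is an assumption about the exact expression; the function names are hypothetical. The conversion uses 1 mD = 9.869233 · 10⁻¹⁶ m².

```python
import numpy as np

MILLIDARCY_IN_M2 = 9.869233e-16  # one millidarcy expressed in square metres


def permeability_from_fit(dp, q, r_i, r_o, h, eta):
    """Fit Q = slope * dp + offset and convert the slope to K in m^2.

    dp in Pa, q in m^3/s, radii and thickness h in m, viscosity eta in Pa*s.
    Assumes Q = 2*pi*h*K*dp / (eta * log(r_o / r_i)).
    """
    slope, _offset = np.polyfit(dp, q, 1)
    return slope * eta * np.log(r_o / r_i) / (2.0 * np.pi * h)


def to_millidarcy(K):
    """Convert a permeability from m^2 to millidarcy."""
    return K / MILLIDARCY_IN_M2
```

The fit would be repeated per vacuum suction cup, with `r_i` and `r_o` taken from the cup geometry and `h` from the thickness measurement.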


4.2 Results and Discussion 


In an experiment, three vacuum suction cups with different geometries were applied at
different pressure differences to electrode samples while the volumetric flow rate was
measured simultaneously. The calculated regressions of the measured volumetric flow over the
pressure difference (shown for the anode in Fig. 4) increase with the pressure difference. It
is noticeable that, contrary to the expected course of the permeability determination curve
(Fig. 2), the measured curve of the smallest vacuum suction cup (ESS-20) is significantly
steeper. This is attributed to the discontinuities on the vacuum suction cup’s contact
surface, which increase the distance to the contact surface. The slopes of VASB-40 and
VASB-55 are positioned relative to each other as expected as per Eq. (2). The asymptotic
volumetric flow rates are often above the regression curve in the low-pressure range. This
may be related to the differential pressure sensor’s low measuring accuracy at low measured
values.

Fig. 4 Pressure difference and volumetric flow rate over time (left), volumetric flow rate
over pressure difference of reference anode (right)

The characteristic regression curves support the modeling approach based on Darcy’s law. As
the sensors are comparatively inexpensive and part of industrial practice, they are a
practical means to determine the permeabilities of electrodes in order to model the flow for
various handling geometries.

With the model, one can determine K of a reference electrode from relatively simple 
geometry, e.g., a vacuum suction cup, based on Eq. (3). From there, using the approach for 
modeling vacuum effective surfaces in Sect. 3.2, one can estimate the average local velocities 
and pressures in similar electrodes resulting from vacuum effective surface geometry, like 
a surface area gripper, a deflection roll, a conveyor belt, or a draw-off roll. 


5 Conclusion 


The authors began manual development and validation to improve cell assembly processes. 
In order to identify, e.g., handling limits, parameters such as pressure difference were varied, 
and their impact on the electrode surface and LIB quality was investigated. Identification of 
sources of damage to electrodes resulting from vacuum-effective surfaces is laborious and 
expensive. Modeling approaches offer the possibility of reducing the time for experiments 
and increasing development efficiency. 

For numerical modeling of damages from the interaction of electrode and handling and 
transport system, knowledge of the effective surface’s geometry and the volumetric flow 
rate resulting from the applied pressure difference are required. 

In this article, a pressure and velocity distribution model for the flow through electrodes 
during handling and transport was developed for vacuum suction cups and vacuum-area 
effective surfaces. It models pressure and velocity based on the porosity and permeability of 
the electrode, the pressure difference, and the handling system’s geometry. The models in 
this article enable the creation of a simulation model to represent the interaction of electrode 
materials and fluids. 

The presented model improves the development of electrode handling and transport pro- 
cesses at several levels. The model maps local effects on the electrode from the vacuum effec- 
tive surface geometry. Thus, the model allows identifying critical areas of the electrodes for 
effect characterization (e.g., electromagnetic or electrochemical), which cuts non-valuable 
experiments. In addition, the model can be used to derive the effects of handling and trans- 
port on electrodes via measurements of porosity and permeability as early as the design 
of the electrodes in the laboratory. In addition, the model can be used to check whether a 
leakage from a non-optimal contact situation is present during handling and transport. 

Our experience shows that modeling local stress and damage of the AM from handling 
and transport is possible. The presented approach allows to model the local aerodynamic 
and adhesion load to the AM particles in conjunction with other techniques, but the process 
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must be optimized. The local stress and damage modeling remain to be discussed as it is 
beyond the scope of this article. 
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Abstract 


Current industrial trends show an increasing demand for individualized products, which 
require highly flexible yet automated production systems. Universal handling systems 
offer an efficient solution for the flexible and safe handling of differing component 
geometries and shapes. An innovative gripper for form-flexible handling combining 
vacuum systems with the flexibility of granulate grippers was established in previous 
research and has continued to prove its flexibility by gripping wide varieties of object 
geometries. The current challenge of modelling and predicting gripping forces for this 
new gripper is addressed in this research. Multiple object geometries are selected and 
examined, with the parameters affecting the air permeability being the most important 
influence for the gripping forces. Along with an overview of influencing factors and 
parameters, a framework for a linear model enabling the prediction of gripping forces 
for different object shapes is developed. The basis for automated prediction of gripping 
strengths for different types of objects is established with this research and could be 
adapted with other, non-analytical models such as machine learning in the future. 
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1 Introduction—Challenges for Universal Handling 


The automation of handling processes is one of the key features and challenges for mod- 
ern production and assembly. In large-scale automation processes with many identical 
objects, handling tools are adapted or specified for individual components. As an exam- 
ple, this can be done by choosing a suitable vacuum gripper size for the accessible object 
surface or adapting gripper fingers of mechanical grippers to the object contours [1]. For 
industrial applications with shortening development cycles of new product variants and 
sometimes even overlapping transitions between product generations, solutions for a 
multifunctional, flexible handling or gripping of different components are required [2]. 
According to Hesse [3], three different types of flexibility for automated handling 
processes can be defined: 


• Functional flexibility: integration of multiple handling operations, such as draping with handling 
• Disturbance flexibility: automatic correction functionality in case of malfunction or misorientation 
• Object flexibility: handling large spectrums of parts with the same gripper 


Especially for smaller batches of varying or even individualized products, the necessary 
versatility and range of grippable objects can be achieved with universal grippers, which 
enable adaptable and flexible automation of these handling processes. To assess 
the suitability of a universal gripper for a spectrum of grippable objects, the respective 
effective gripping force and success frequency have to be examined in order to validate the 
gripper's applicability [4]. The main focus of this publication is the development of such 
an assessment for a universal gripper created at the Technische Universität Braunschweig 
[5] and to enable an analytical model to predict the respective gripping forces for differ- 
ent types of object contours. 


2 State of the Art 


The current state of the art describes multiple solutions for flexible gripping of different 
kinds of objects. One possibility for universal gripping is the combination of multiple 
different grippers into one multi-effector system, which is able to choose a suitable grip- 
per from a selection of pre-installed specialized grippers. This can lead to high weights 
as well as somewhat bulky effector setups, but allows effective handling of multiple 
pre-selected types of objects (Fig. 1a). Another possible solution is effector changing 
systems. The applications are similar to multi-effector systems, as the spectrum of objects 
also has to be exactly determined beforehand, as a suitable gripper has to be available for 
the required tasks (Fig. 1b). 

Fig. 1 Solutions for gripping different objects [6-9]: a) Multi-effector systems, b) Effector changing systems, c) Adaptable grippers 

Truly adaptable grippers (Fig. 1c) suitable for multiple shapes and object types can 
often be assigned to the field of soft robotics. Typical examples are usages of adaptable 
surfaces, such as FinRays [9-11], often in combination with other gripping principles 
such as mechanical and electrostatic mechanisms [12]. 

One of these soft robotic applications that has been gaining traction in the past few 
years is the robotic gripper based on the jamming of granular materials [13, 14]. These 
grippers use airtight cushions filled with different granular materials. When a vacuum is 
applied to these cushions, the granular material compacts, jams and enables a gripping 
force to be exerted on different kinds of objects through friction and interlocking with 
the respective surfaces. For these grippers, influences such as the stiffness of cushion 
membrane material, the granulate material, the object enclosure and the conditions of the 
granulate material have been examined [15-18]. 


Previous Works 
Based on a combination of the granulate grippers with a vacuum gripper, an innova- 
tive handling principle for form-flexible handling was developed at Technische Univer- 
sität Braunschweig. The key difference to standard granulate grippers is a porous area in 
the gripping cushion, which allows an additional vacuum force to build up (Fig. 2). 
Previous research has shown that this combination of gripping principles can achieve 
a high adaptability for gripping mechanisms and therefore a capability for handling large 
varieties of objects [20-22]. This previous research also examined influences of the air- 
flow rate and vacuum on the state of the granulate material, with a certain solidity of the 
granulate being reached for the best vacuum seal and thus the highest vacuum. In this 
state the gripper combines the two gripping principles most effectively and the highest 
gripping forces are reached [5]. The key advantage compared to previous research on 
granulate grippers is the combination of the ability to grip flat objects [14] with the large 
increase in adaptability achieved with the granulate gripping principles. 

Fig. 2 Innovative vacuum-based granulate gripper. a Example for this gripper [19], b Schematic 
and structure of the gripper (labeled components: connection to robot, vacuum sensor, 
connector/vacuum adapter, base frame, separation grid, safety fleece, granular material, 
airtight cushion membrane) 


3 Analytical Model Frameworks for Gripping Forces 


The resulting gripping force is influenced by different factors, which are described in 
Fig. 3. The influences are divided into three categories, originating from the applied grip- 
per, the grasped object as well as the influences of the gripping strategy. 

Under optimum conditions for all of these influencing factors, the porous area in the 
gripper cushion is fully sealed with the grasped object and a maximum vacuum gripping 
force can be applied. The theoretical maximum of this vacuum-based gripping force is 
calculated with the following formula: 


$$F = \Delta p \cdot A \tag{1}$$


The gripping force $F$ results from the applied vacuum $\Delta p$ and the covered surface area 
$A$ of the grasped object. For this research, the influences of the grasped objects are the 
main focus. 


Fig. 3 Influencing factors for the achievable gripping strength with the examined vacuum-based 
granulate gripper [5, 22, 23] 

Gripper design: gripper dimensions; gripper membrane (material, stiffness etc.); granular material (elasticity, size, filling level etc.) 
Grasped objects: object form; object size/dimensions; object material and surface roughness 
Gripping strategy: initial contact force; trajectory; applied vacuum 


Experimental Setup 

In alignment with preliminary experiments, a cylindrical gripper with a diameter of 
150 mm and a height of 60 mm was chosen (see Fig. 2). The membrane consists of 
1.25 mm polyurethane. The porous area with a size of ~4500 mm² extends to a maximum 
diameter of 95 mm and is arranged symmetrically around the center axis. As granulate 
material, ~4100 ABS (acrylonitrile butadiene styrene) beads with a diameter of 
6 mm were used, filling 66% of the maximum cushion volume. The gripper is mounted 
to a K6D40 force sensor with a maximum force in Z-direction of 500 N and a Kuka LBR 
iiwa 14 R820, and is coupled with an adjustable vacuum pump Variair Unit SV 201/2. The 
maximum compressor power of 4 kW results in a maximum pressure difference of up to 
0.42 bar. The pressure difference is measured by a VS VP8 SA M8-4 with a range of −1 
to +8 bar, mounted close to a valve between the vacuum pump and the gripper. 

As the gripper shows similarities to approaches with standard vacuum grippers, a sim- 
ilar gripping and motion sequence is applied in this research and the measured forces 
and pressure differences are shown in Fig. 4. After positioning the gripper directly above 
the clamped test object, the gripper is moved perpendicularly to the object surface until 
a previously defined initial contact force is reached. No manual positioning or external 
influencing of the gripper is applied, even though this manual intervention has achieved 
a form fit for standard granulate grippers in the past [14]. With the gripper being posi- 
tioned directly on the test object, a valve to the air compressor is opened and a vacuum 
is generated in the gripper. After a delay of 2.5 s, the maximum vacuum with the applied 
compressor power is reached and the gripper is pulled vertically upwards at a defined 
pull-off speed. The maximum force at which the gripper detaches from the object's 
surface is used in the following sections for the analysis of influencing parameters. 


Experimental analysis of influencing parameters 

The main goal is to model the influences of different objects and geometries on the pos- 
sible gripping forces using the specified gripper. For an optimal setup of objects, the 
influence of the object material, surface roughness and air permeability as well as the 
initial contact force has to be quantified. For this, the type of material as well as the surface 
roughness and the initial contact force were examined (see Fig. 5). Five different convex 
cylinders with a diameter of 242 mm made from realistic materials such as aluminum as 
a representation for milled parts, polyurethane and paper for parcels and packaging and 
PLA (polylactic acid) for synthetic components were used. Fifty experiments with a rising 
contact force between 20 and 280 N and a constant compressor power of 50% were 
carried out on the curved surfaces of the convex cylinders. 

Fig. 4 Experimental procedure for an initial contact force of 80 N and 50% compressor power 
(force in Z-direction [N] and vacuum over time [s]; marked events: initial contact force, 
opening valve, max. gripping force, closing valve) 

Resulting from these experiments, initial contact forces between 50 and 200 N prove 
to be most applicable, as the maximum gripping force falls off for initial contact forces 
below 50 N and scatters broadly over 200 N. In this range for initial contact forces, the 
mean and variance values of the different objects are quite comparable with the highest 
deviation between the mean values being under 5%. As a result of these experiments, 
the objects used for the further research were manufactured additively with the settings 
achieving the surface quality of the “smooth PLA” (Fig. 5), as this enables a time-effi- 
cient and precise design for complex geometries. 

As a secondary examination, the air permeability of the objects as well as the influence 
of the compressor power and resulting pressure difference are analyzed (Fig. 6). For 
this, air-permeable, rotationally symmetrically perforated flat surfaces with a gripping 
area of 60, 80 and 90% were prepared and compared to a 100% airtight surface. A constant 
initial contact force of 50 N was used. Due to air flow effects of different sized 
porous openings, not all air permeable surfaces will show the exact same resulting grip- 
ping forces. Therefore, these experiments serve as a reference for the assessment of the 
influence of air permeability. 

As seen in Fig. 6, compressor power under 40% results in a low pressure difference 
for all test objects. For higher levels of compressor power, the surfaces with 60 and 
80% coverage fail to achieve a pressure difference of over 0.1 bar, and the maximum 
gripping forces are below 25 N. As the experiments with 90% and above are able to achieve 
a pressure difference of over 0.1 bar as well as a gripping force of over 50 N, this degree 
of sealing is defined as a minimum requirement for the grippability of objects. For the 
highest achieved pressure differences, a comparably large spread of the data points is 
observed. This is presumed to be a result of a squeeze-out effect pushing the granular 
material through the outer membrane, creating a structured surface with reduced contact 
to the object surface. 

Fig. 5 Resulting maximum gripping forces for convex cylinders with a diameter of 242 mm with 
different surfaces, materials and initial contact forces. The mean and variance values in the marked 
area between 50 and 200 N are shown in the table 

Fig. 6 Four steps of equally distributed relative coverage of the porous areas by flat surfaces. 
a Relative compressor power over pressure difference. b Maximum achievable gripping force over 
pressure difference 

After analyzing the influence of material and air permeability, a multitude of objects 
made from “smooth PLA” are used in order to examine the influences of different 
geometries. The surfaces of these example geometries are airtight. Experiments resulting in 
gripping forces below 50 N are therefore considered failures, in which the gripper was 
not able to achieve a seal of over 90% with the object surface. 
Fig. 7 shows an exemplary extract for flat surfaces, convex edges and concave cylinders. 
The maximum gripping forces show a seemingly linear correlation with the 
pressure difference as well as a clear difference in the slopes for the different objects, 
resulting in different maximum gripping forces for high pressure differences. 

Fig. 7 Maximum gripping force over pressure difference for three example objects 

Empirically adapted analytical model 
Using the linear dependence of resulting vacuum forces on the pressure difference 
shown in formula 1 as well as the surface of the porous area of the gripper cushion of 
$A_{por}$ = 4500 mm², a theoretical achievable maximum vacuum gripping force of 189 N at 
a maximum pressure difference $\Delta p_{max}$ of 0.42 bar is calculated. However, experiments 
have shown a larger mean gripping force of 250 N for an airtight flat surface with this 
pressure difference. Influences from the granulate gripping principles are not applicable, 
as previous research has shown no effect on flat surfaces for purely granulate grippers. 
As another influencing factor, the pressure difference at the porous surface will most 
likely differ somewhat, as the vacuum sensor cannot be feasibly located there. However, 
the used vacuum pump is not able to create a pressure difference of over 0.42 bar and the 
sensor is able to measure this maximum value, so this difference in force is most likely 
not only a result of a deviation from the measured pressure difference. Therefore, this 
discrepancy between the theoretical and measured gripping forces is empirically approx- 
imated by a larger surface area being under the effect of the pressure difference than the 
actual porous area of the gripper (see Fig. 8). 

A theoretical effective surface area of ~6000 mm² can be calculated with formula 1, 
which translates to an effective circular area with a diameter of 87 mm. The theoretical 
maximum gripping force $F_{tmax}$ for a greatest possible affected area $A_{tmax}$ (~17,668 mm²) 
with the maximum diameter of the gripper of 150 mm and $\Delta p_{max}$ is 742 N. For more 
complex objects and geometries, the gripping forces resulting from the granulate gripping 
principles such as friction have to be considered. However, specific forces resulting 
purely from the granulate gripping principle cannot be distinguished, as the vacuum 
gripping force cannot be avoided. Therefore, a combined correction parameter $C_{combined}$ 
for the influence of the object geometry on granulate as well as vacuum-based gripping 
forces is introduced in formula 2. This correction parameter is defined as a value 
between 0 and 1 (see formula 3). 


$$F = C_{combined} \cdot \Delta p \cdot A_{tmax} \tag{2}$$

$$C_{combined} = \frac{S_{object}}{S_{tmax}} \tag{3}$$


Fig. 8 Approximation for the empirical area. a Model for only $A_{por}$ being in effect. b Model for 
effective area (labeled components: gripper with airtight membrane, porous area, areas resulting 
in $A_{por}$, actual effective area, test object) 

$S_{object}$ is calculated as the slope of the linear approximation ($F/\Delta p$, see Fig. 7) of the 
achievable gripping forces over the pressure difference. $S_{tmax}$ is the theoretical maximum 
achievable slope calculated with $F_{tmax}$. As an example, this results in a $C_{combined}$ of 0.330 
for the airtight flat surface previously examined. 
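
The numbers quoted for formulas 1 to 3 can be retraced with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: the function names and unit constants are chosen here for illustration, while the input values (0.42 bar, 4500 mm², 250 N, 150 mm, 0.330) are taken from the text.

```python
import math

BAR_TO_PA = 1.0e5    # 1 bar = 100 000 Pa
MM2_TO_M2 = 1.0e-6   # 1 mm^2 = 1e-6 m^2


def vacuum_force(delta_p_bar, area_mm2):
    """Formula 1: F = delta_p * A, returned in newtons."""
    return delta_p_bar * BAR_TO_PA * area_mm2 * MM2_TO_M2


def effective_area_mm2(force_n, delta_p_bar):
    """Formula 1 solved for A: the area that would explain a measured force."""
    return force_n / (delta_p_bar * BAR_TO_PA) / MM2_TO_M2


def circle_diameter_mm(area_mm2):
    """Diameter of a circle with the given area."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)


def predicted_force(c_combined, delta_p_bar, a_tmax_mm2):
    """Formula 2: F = C_combined * delta_p * A_tmax."""
    return c_combined * vacuum_force(delta_p_bar, a_tmax_mm2)


DELTA_P_MAX = 0.42                      # bar, maximum pressure difference
A_POR = 4500.0                          # mm^2, porous area of the cushion
A_TMAX = math.pi * (150.0 / 2.0) ** 2   # mm^2, full gripper cross section

f_porous = vacuum_force(DELTA_P_MAX, A_POR)            # about 189 N
a_eff = effective_area_mm2(250.0, DELTA_P_MAX)         # about 5950 mm^2
d_eff = circle_diameter_mm(a_eff)                      # about 87 mm
f_tmax = vacuum_force(DELTA_P_MAX, A_TMAX)             # about 742 N
f_flat = predicted_force(0.330, DELTA_P_MAX, A_TMAX)   # about 245 N
```

With the correction parameter of 0.330 the sketch returns roughly 245 N for the airtight flat surface, close to the measured mean of 250 N.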

In an ideal setup of an airtight flat surface, $C_{combined}$ represents a factor for the 
effective pressure difference as well as the effective area, since this setup is only affected by 
the vacuum gripping principles. For 3D objects, $C_{combined}$ represents the influences of the 
granular as well as vacuum gripping principles. For more complex objects, some parts 
of the porous area will not be perpendicular to the gripping trajectory, which reduces the 
effective gripping area. However, no direct correlation between the perpendicular area 
and $C_{combined}$ is observed, as the minimum requirements for shapes are not identical for 
convex and concave surfaces. Therefore, formula 3 continues to use $A_{tmax}$. Table 1 gives an 
overview of the resulting values for $C_{combined}$ for different object shapes, approximated 
over 30 experiments with a linear distribution of compressor power between 33 and 100% 
and an initial contact force of 80 N. It is intended to provide a broad overview for most 
common geometries and lists the specific requirements for grippability. The requirements 
all result in a sealing of the gripper with the surface of over 90% (see Fig. 6) and 
thus a gripping force of over 50 N. The grippability of concave cylinders, spheres and 
cones is mostly influenced by the size of the object's opening for the base frame of the 
gripper to fit, so this is not shown further. The correction parameters in Table 1 show 
a variety of gripping strengths for different objects with a regression quality of over 0.9, 
which is evidence for a good approximation. 
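
How a correction parameter and its regression quality could be obtained from measured data is sketched below. This is an illustration under stated assumptions: the fit is forced through the origin (the text only speaks of a linear approximation of F over the pressure difference), the measurement values are invented placeholders, and all function names are hypothetical.

```python
def fit_slope_through_origin(delta_p, force):
    """Least-squares slope S of force = S * delta_p (no intercept, an assumption)."""
    numerator = sum(p * f for p, f in zip(delta_p, force))
    denominator = sum(p * p for p in delta_p)
    return numerator / denominator


def r_squared(delta_p, force, slope):
    """Coefficient of determination of the fitted line against the data."""
    mean_force = sum(force) / len(force)
    ss_res = sum((f - slope * p) ** 2 for p, f in zip(delta_p, force))
    ss_tot = sum((f - mean_force) ** 2 for f in force)
    return 1.0 - ss_res / ss_tot


def correction_parameter(s_object, s_tmax):
    """Formula 3: C_combined = S_object / S_tmax."""
    return s_object / s_tmax


# Placeholder measurements (bar, N); NOT data from the paper.
pressures = [0.10, 0.20, 0.30, 0.40]
forces = [58.0, 118.0, 174.0, 235.0]

# Theoretical maximum slope: F_tmax / delta_p_max = 742 N / 0.42 bar.
s_tmax = 742.0 / 0.42

s_object = fit_slope_through_origin(pressures, forces)
c_combined = correction_parameter(s_object, s_tmax)
quality = r_squared(pressures, forces, s_object)
```

If the authors fitted a line with an intercept instead, only `fit_slope_through_origin` would need to change; the ratio in formula 3 stays the same.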


Influence of Scale and Diameter 
The scale of the object has a notably low influence. Starting at the specified 
minimum requirement shown in the last column of Table 1, the correction parameter rises 
only slightly until a sufficient similarity to a flat surface is reached (see 
Fig. 9). A cylinder diameter of 200 mm, more than double the initial value of 90 mm, has 
a slightly higher slope and thus a slightly increased $C_{combined}$. However, up to a diameter 
of 160 mm almost no difference is visible. A merged calculation of $C_{combined}$ for the five 
shown convex cylinders results in a value of 0.311 with an R² of 0.907, which differs less 
than 5% from the calculated $C_{combined}$ for a diameter of 90 mm. 

Fig. 9 Gripping forces over pressure difference for different diameters of convex cylinders 


4 Conclusion and Outlook 


The main goal of this research was to show a dependence of gripping forces on object geom- 
etries and to formulate an analytical model for calculating the maximum possible forces for 
these geometries. This is done to gain an understanding of the applicability of this flexible 
gripping solution. These goals were achieved and a linear approximation showed high regres- 
sion quality for a spectrum of different objects. Minimum requirements for the usage of this 
model are the defined geometric characteristics, which result in a sealing of the gripper with 
the surface of the object of more than 90%. This enables further applications in combination 
with approximations of objects with similar shapes and geometric features. This could be 
done by comparing objects to previously tested data sets and interpolating $C_{combined}$, or through 
a utilization of machine learning with a 3D camera. This would enable gripping force 
prediction for unknown objects on the basis of this research. Further expansion of 
the formula is possible for variations in the gripper configuration: previous research has shown 
some influences of parameters such as granulate size and membrane material, which 
will be examined in further research. Additionally, the current approach is limited to a basic 
vertical gripping strategy. Other, more complex trajectories might result in a better sealing of 
the gripper cushion with the object surface and thus a higher possible gripping force. 

Abstract 


Automated handling of technical textiles poses major challenges for modern handling 
systems. Previous research has shown that using suction grippers is feasible for han- 
dling processes involving textiles. However, separating individual sheets of air-perme- 
able materials from a stack using such grippers is a nontrivial task. This paper details 
an automated stack singulation process using low pressure suction grippers leveraging 
online data from differential pressure sensors to control the singulation process. Sub- 
systems are analyzed to derive a governing model representation of the process. This 
model is deployed on a robotic test rig validating the process in experimental analysis. 
Using this approach, a controlled singulation process for stacked carbon fiber mats 
has been achieved with a success rate exceeding 99% showing the practicality of con- 
trolling the internal suction pressure for advanced handling processes using low pres- 
sure suction grippers. Further improvement could be achievable by incorporating and 
fusing multiple sensor principles. 

1 Introduction and Related Work 


Fiber reinforced plastics (FRP) are, due to their good weight-specific mechanical 
properties, used in high performance applications with minimum weight requirements [1, 2]. 
However, the production of FRP parts is labor intensive, with few automated processes 
in place [3]. In particular, the low bending stiffness of woven fiber mats as well as their 
fragile nature lead to challenges in robotic handling. 

For efficient processes, it would be preferable to start a production process from a 
stack of raw fabric plies, as this allows the cutting and preparation in bulk [4]. When 
using such a stack, the first and foremost task is to separate single sheets of fabric mate- 
rial from this stack for further handling and manipulation. While processes such as pinch 
gripping or freeze gripping have been shown to be capable of singulating single plies 
from a stack, they can lead to undesirable alterations on the gripped ply or the remaining 
stack underneath [5]. 

In contrast, low pressure suction grippers are capable of handling even fragile fabrics 
without damage [6]. However, the porous nature of most textile materials makes stack 
singulation very difficult due to the second or third ply still being affected by the 
grippers’ vacuum suction [7]. 

Cubric et al. have successfully used vacuum grippers for grasping textile materials; 
however, they concluded: ‘It has also been found that the application of this vacuum 
gripper is not suitable for taking one layer of fabric from material bundle’ [8]. 

In further research, it was shown that parameters such as area mass density, gripper 
position, suction cup geometry, and supply pressure have a major influence on the suc- 
cessful handling of non-woven textiles. However, no attempt has been made to quantify 
or model any of these influences [9]. 

The author in [6] has successfully used a vacuum suction gripper for separating sin- 
gle plies of woven carbon fiber mats from a stack, by controlling the electrical contact 
resistance of the carbon fiber pressing against the suction cup. The main deficit of this 
approach is its dependence on the conductivity of the handled materials. 

In this paper, a robotic gripping system capable of reliable single ply separation for 
woven technical textiles is presented, tackling the aforementioned deficits. To achieve this 
task, the gripping system is split up into relevant subsystems and their influences on the 
gripping process are modeled, allowing the generation of a model capable of predict- 
ing the suction pressure inside the gripper depending on chosen process parameters. This 
model in turn enables: 


• Measuring the number of plies adhering to the gripper at any given time 
• Determination of a suction pressure at which the gripper will be most likely to grip a 
  single ply from the stack 


2 Gripping System Overview 


The main end effector used in this research consists of four Schmalz SCG 1xE100 low 
pressure suction grippers. Every individual gripper element (see Fig. 1) consists of: 


• A vacuum generating Coanda ejector. 
  High pressure air supplied to the pressure inlet (1) is accelerated through a small slit. 
  The high velocity airstream adheres to the outlet walls of the ejector curving away 
  from the suction chamber. This high speed airstream leads to a low pressure zone 
  inside the chamber (4). 

• A screw-on perforated plate (2) that encloses the suction chamber and ultimately acts 
  as the interface to any gripped fabric. 

• A vacuum pressure sensor monitoring the suction pressure $p_V$ through a bypass 
  opening (3) directly into the suction chamber. 


3 Subsystem Modelling 


Since the pressure in the suction chamber is the only process parameter that can be 
measured in this gripper setup, an attempt is made to model the suction pressure as a 
function of other effective process parameters, as shown by the Ishikawa diagram in 
Fig. 2. 

For a detailed analysis of the gripping-related subsystems, the system is divided into 
4 subsystems affecting the generated suction pressure: vacuum generation, perforated 
plate, material properties and load-case dependent leakage. 


Fig. 1 Left Stylized 3D cut through an SCG gripper, Right Gripper cross section 

Fig. 2 Ishikawa diagram of major influences on the suction pressure inside the gripper 


3.1 Vacuum Generation 


The vacuum generated by the gripper is mainly dependent on the input pressure supplied 
to it. This input pressure is controllable from 0 to 4 bar by adjusting a voltage-controlled 
pressure control valve at the input port. Another major influence on the vacuum generated 
by the Coanda ejector is the flow restriction imposed on the incoming suction airstream. 

In this section the generation of a predictive model is discussed, which allows the 
determination of the suction pressure inside the gripper $p_V$ at any given time dependent 
on the input pressure and the volumetric flowrate of the suction airstream $Q_V$. 

To determine this vacuum generation characteristic, the volumetric flow rate of air $Q_V$ 
entering the gripper is recorded, as well as the suction pressure $p_V$, while varying the 
supply pressure and placing different flow resistances in the suction path (see Fig. 3 left). 
To ensure a smooth, almost laminar airflow at the location of the volumetric flowrate 
measurement, a 30 cm long pipe is added to the bottom of the gripper-restriction assembly. 

This generates a three-dimensional dataset of 200 specific operating points. The linear 
regression model fitted to this data set yields a characteristic equation (see Fig. 3 right): 
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bar? 
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aıı = 0.3824 Darl ‚san = OL 


With an adjusted coefficient of determination of $R^2_{adj} = 0.9299$. 

Fig. 3 Left Cross-section through the experimental setup for the measurement of vacuum 
generation characteristics, Right Interpolated vacuum generation characteristic model 

Fig. 4 Left Perforated plate geometries used in this study, Right Accompanying characteristics. 
All subplots are equally scaled: $p_V$ = 0 to 10000 Pa and $Q_V$ = 0 to 600 l/min 
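
The adjusted coefficient of determination reported for the vacuum generation model penalizes the plain R² for the number of fitted terms. The sketch below shows the standard definition. The sample count of 200 operating points comes from the text, whereas the number of regression terms is not recoverable from it and is only a placeholder here.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than fitted terms")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / dof


N_OPERATING_POINTS = 200   # from the text
N_TERMS_ASSUMED = 5        # placeholder, the actual model order is not given

# Example: a plain R^2 of 0.93 shrinks slightly once the terms are accounted for.
example = adjusted_r_squared(0.93, N_OPERATING_POINTS, N_TERMS_ASSUMED)
```

With 200 samples the correction is small, so the adjusted and plain values lie close together for any modest number of terms.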


3.2 Perforated Plates 


The suction chamber of these grippers is enclosed by a perforated plate, which also acts 
as an interface to the gripped textile. The design of this interface however is not predeter- 
mined, allowing for adjustments specific to any use case. To determine the influence of 
these suction plates, the experiment described above is repeated with 9 different suction 
plate designs shown in Fig. 4. These suction plate designs vary in the size of individual 
orifices Dy as well as the combined cross sectional area of the orifices. 

By 3D printing these test specimens and measuring the volumetric airflow through 
them at varying suction pressures, a characteristic resistance graph for every suction plate 
can be determined, as shown in Fig. 4 right. 

The literature on airflows through perforated plates [10, 11] states that the pressure 
loss of laminar flow through such a plate is supposed to be: 

$$p_{Ambient} - p_S = \frac{1}{2}\, Eu\, \rho\, V^2$$

with 

$p_S$   Vacuum pressure in the suction chamber 
$Eu$    Euler number / pressure loss coefficient 
$\rho$  Fluid density 
$V$     Pipe bulk-mean velocity 


With $Q_V^2$ being proportional to $V^2$ and $p_V$ as the left hand pressure differential, the 
equation can be restated with a lumped pressure loss coefficient $\xi$ under the assumption of 
constant fluid density (which nearly holds apart from changes in environmental 
conditions) as follows: 

$$p_V = \xi\, Q_V^2$$

While an increase in plate porosity clearly coincides with an increase in airflow and 
consequently a reduction in suction pressure, the influence of the hole diameters is not 
monotonic. 
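
Given measured pairs of flow rate and suction pressure for one plate, the loss coefficient of the quadratic law can be estimated in closed form. The following is a minimal sketch assuming an ordinary least-squares fit; the function names and the sample values are illustrative and not taken from the paper.

```python
def fit_loss_coefficient(flows, pressures):
    """Least-squares estimate of xi in p_V = xi * Q_V**2.

    Minimizing sum((p - xi * q**2)**2) gives xi = sum(p * q**2) / sum(q**4).
    """
    numerator = sum(p * q * q for q, p in zip(flows, pressures))
    denominator = sum(q ** 4 for q in flows)
    if denominator == 0.0:
        raise ValueError("at least one non-zero flow rate is required")
    return numerator / denominator


def plate_pressure(xi, flow):
    """Suction pressure predicted by the quadratic plate characteristic."""
    return xi * flow ** 2


# Placeholder data: flow in l/min, pressure in Pa (not measurements from the paper).
flows = [100.0, 200.0, 400.0]
pressures = [plate_pressure(0.02, q) for q in flows]
xi_fit = fit_loss_coefficient(flows, pressures)
```

Fitting one coefficient per plate makes the nine characteristics in Fig. 4 directly comparable through a single number each.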


3.3 Air-Permeable Textiles 


An airstream through a porous medium, such as a woven fabric, leads to a pressure drop, 
which can be described by the following expression [12]: 

$$\Delta p = C_1 \frac{Q_V}{A} + C_2 \left(\frac{Q_V}{A}\right)^2$$

With 

$\Delta p$  the pressure loss across the medium 
$C_1$       linear loss coefficient 
$C_2$       quadratic loss coefficient 
$A$         textile area exposed to the airflow 


With low flow rates $Q_V$ the quadratic term can be dropped, leading to the equation used 
in the norm DIN EN ISO 9237 [13]. 

For any material used, the air permeability $R$ can easily be determined as described by 
the norm [13] by measuring it at a certified test stand. This value $R$ is the area-normalized 
airflow through a textile at a 200 Pa pressure differential. Therefore, the value of $C_1$ 
can easily be determined and substituted into the pressure differential equation above: 

$$C_1 = \frac{200\,\mathrm{Pa}}{R} \quad\Rightarrow\quad \Delta p = \frac{200\,\mathrm{Pa}}{R} \cdot \frac{Q_V}{A}$$

Combining this with the pressure drop across the suction plate yields: 

$$\Delta p = \frac{200\,\mathrm{Pa}}{R \cdot A}\, Q_V + \xi\, Q_V^2$$

This characteristic can be intersected with the vacuum generation model shown in Sect. 
3.1. This approach, however, achieves only moderate success in correlating to real 
measured data, as shown in Fig. 5. 
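
The intersection of the load characteristic with the vacuum generation characteristic can be found numerically. The sketch below is an illustration only. It assumes the combined load curve has the form Δp = 200 Pa / (R·A) · Q_V + ξ·Q_V² in SI units, with A as a hypothetical symbol for the permeated textile area, and it uses a made-up falling straight line as a stand-in for the fitted generator model, whose coefficients are not recoverable from the text.

```python
def load_pressure(flow, permeability, area, xi):
    """Assumed load curve: textile term (linear) plus plate term (quadratic).

    flow in m^3/s, permeability R in m/s at 200 Pa, area in m^2, xi in Pa/(m^3/s)^2.
    """
    return 200.0 / (permeability * area) * flow + xi * flow * flow


def operating_point(generator, load, flow_max, tol=1e-12, max_iter=200):
    """Bisection for the flow at which generator(flow) equals load(flow)."""
    lo, hi = 0.0, flow_max
    f_lo = generator(lo) - load(lo)
    f_hi = generator(hi) - load(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change on [0, flow_max]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = generator(mid) - load(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Placeholder characteristics; none of these numbers come from the paper.
def generator(flow):
    return 5000.0 - 1.0e6 * flow     # hypothetical ejector curve in Pa


def load(flow):
    return load_pressure(flow, permeability=0.5, area=0.005, xi=1.0e8)


q_star = operating_point(generator, load, flow_max=0.005)
p_star = load(q_star)                # suction pressure at the operating point
```

Bisection is used because it needs only a sign change and therefore works for any measured or fitted generator characteristic passed in as a function.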

While this theoretically derived model reproduces the overall characteristics of the 
gripper, it does not capture every interaction. An accurate prediction of the suction 
pressure inside the gripper is therefore not achievable without a use-case-specific 
experimental determination of the gripper characteristics. 


3.4 Leakage Currents 


As shown in Fig. 1 (right) deformations in the fabric can lead to air streams not com- 
pletely passing through the fabric. These stray air currents will not apply holding forces 
on the textile and therefore play a huge role in grip security and the separation of textiles 
from the gripper at low supply pressures. This subject is addressed by introducing a sca- 
lar load equivalent term which will be obtained by a simple static mechanical simulation. 

For the generation of such a scalar load equivalent value 6 limp materials are selected 
and their mechanical properties needed for simulation are determined, as shown in 
Table 1: 

Figure 5 (right) shows the good match of simulated cantilever test with the real one 
for one example material. 

To quantify the deformations of a single textile sheet with a single scalar value, a cir- 
cular path around the gripper is defined where the deformation of the simulated textile is 
measured around the circumference of said path d(s) with the normalized path length s. 
This in turn allows the calculation of the load equivalent term L. 


$$L = \int_0^1 d(s)^2 \, ds$$
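
For a simulated deformation sampled at discrete points along the circular path, the integral can be approximated numerically. A minimal sketch, assuming equally spaced samples over the normalized path length from 0 to 1 and the trapezoidal rule; the function name is hypothetical.

```python
def load_equivalent(deformations):
    """Trapezoidal approximation of L = integral of d(s)^2 over s in [0, 1].

    `deformations` holds d(s) at equally spaced s, including both s = 0 and s = 1.
    """
    n = len(deformations)
    if n < 2:
        raise ValueError("at least two samples are required")
    step = 1.0 / (n - 1)
    squared = [d * d for d in deformations]
    interior = sum(squared[1:-1])
    return step * (0.5 * squared[0] + interior + 0.5 * squared[-1])


# Sanity checks with analytic cases: constant d gives d^2, d(s) = s gives 1/3.
constant_case = load_equivalent([2.0] * 11)
linear_case = load_equivalent([i / 1000.0 for i in range(1001)])
```

Because the path is closed, the samples at s = 0 and s = 1 describe the same point on the circle and should carry the same deformation value in real data.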


Table 1 Material properties and method of acquisition

Symbol        | Designation       | Method
ρA            | Area mass density | Weighing of square samples (150 mm) based on DIN EN 12127 [14]
a             | Thickness         | Thickness measurement loosely based on DIN EN ISO 5084 [15]
Ep, EL        | Elastic moduli    | Cantilever test loosely based on DIN 53362 [16]
ν12           | Poisson’s ratio   | Estimation based on literature [17]
G12, G13, G23 | Shear moduli      | Matching simulated deformations to physical test samples based on [17]

Fig. 5 Left: suction pressure over supply pressure for 9 different suction plates. Orange line:
calculated model; blue line: smoothed measurement data. Right: cantilever test for the
determination of simulation parameters, with simulated deformations in good agreement
(local von Mises stress shown in false color mapping)

Fig. 6 Left: correlation of load equivalent L and separation pressure s10. Right: five of the
considered example load cases


While these deformations have little effect on the suction pressure inside the gripper
when the textile is securely gripped, they strongly influence the separation of layers
from the gripper. To verify this load equivalent, 8 different load cases were defined
with varying material geometries and gripper configurations. Five of these variations
are shown in Fig. 6 (right). After grasping the material, the operating pressure is reduced
until the ply drops off. When the supply pressure at which separation occurs, s10, is
recorded, a correlation between s10 and the load equivalent term L can clearly be seen
in Fig. 6 (left).

While the specific linearity factor depends on the selected combination of suction plate
and material, it can be determined experimentally. It can later be used to determine the
separation point, i.e. the minimal pressure at which a single ply is still gripped
successfully, which is a good starting point for the minimal supply pressure for stack
singulation.
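
How such an experimentally determined linear relation could be used is sketched below. This is only an illustration under stated assumptions: the calibration points are made up, the straight-line model with an intercept and the safety margin are assumptions, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def min_supply_pressure(load_equivalent, slope, intercept, margin=0.05):
    """Predicted separation pressure plus an assumed safety margin (bar)."""
    return slope * load_equivalent + intercept + margin


# Made-up calibration data for one suction plate and material combination:
# load equivalent L and measured separation pressure in bar.
calib_L = [1.0, 2.0, 3.0, 4.0]
calib_p = [0.30, 0.50, 0.70, 0.90]

slope, intercept = fit_line(calib_L, calib_p)
p_start = min_supply_pressure(2.5, slope, intercept)
```

The fitted slope plays the role of the linearity factor, and `p_start` is a candidate starting value for the minimal supply pressure of a new load case with a simulated load equivalent of 2.5.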


4 Robotic Workstation 


As mentioned above, the goal is to prove the concept of robotic single ply separation
from a given textile stack. Therefore, a robotic test stand is built, made up of a UR5e
collaborative robot and a gripper assembly containing 4 low pressure suction grippers,
each equipped with a −1 to 1 bar differential pressure sensor, which can be seen in
Fig. 7 (left).


4.1 Control Structure 


To control the test stand, an OPC UA server running on a Beckhoff PLC is used. The
singulation routine itself runs as Matlab control code on a connected personal computer,
which acts as the control backbone. The control architecture can be seen in Fig. 7.

Before programming the routine, the experiments described in Sect. 3 are conducted
to generate:


1. A model of the suction pressure expected when gripping a single layer, which is used
to determine the number of gripped materials.

2. A material model usable in simulation.

3. A model of the minimum supply pressure at which a single sheet just barely adheres to
the gripper, based on the simulation-derived load equivalent at this gripper.


[Figure 7: block diagram of three OPC UA nodes. Slave (OPC UA client): movement control,
force feedback, stack detection. Beckhoff PLC, slave (OPC UA server): data broker, pressure
control, valve activation, sensor readout. Matlab control, master (OPC UA client): data
analysis, data interpretation, movement request, algorithmic control.]

Fig. 7 Control structure for the robotic workstation

These experimental results can be used in the following control strategy for stack
separation:

1. Move to a location just above the stack.

2. Move down until contacting the stack.

3. Activate every gripper with a supply pressure just above the minimum supply
pressure for single layer handling, obtained from the analysis described in Sect. 3.4.

4. Slowly move upwards clear of the stack.

5. Increase the supply pressure in every gripper to a detection pressure (e.g. 3 bar).

6. Analyze the resulting suction pressure and compare it to the expected value for
single layer handling.

7. If the values in step 6 differ, either discard all gripped sheets or place them back
atop the stack.

8. Else, lower the supply pressure to a safe handling pressure (e.g. 1 bar) and
move the single gripped sheet to its destination.
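
The eight steps above can be sketched as a short routine. The sketch is written in Python for illustration only; the actual control code runs in Matlab and is not reproduced here. The class `FakeStation`, its methods, the target names and all pressure values other than the 3 bar and 1 bar examples are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Pressures:
    """Pressures in bar. Defaults are placeholders, not measured values."""
    minimum: float = 0.4             # just above single-layer separation (assumed)
    detection: float = 3.0           # example value from step 5
    handling: float = 1.0            # example value from step 8
    expected_suction: float = -0.20  # single-layer suction model (assumed)
    tolerance: float = 0.03          # allowed deviation in step 6 (assumed)


class FakeStation:
    """Stand-in for the robot and PLC interface; records received commands."""

    def __init__(self, suction_readings):
        self.suction_readings = suction_readings  # one value per gripper
        self.log = []

    def move(self, target):
        self.log.append(("move", target))

    def set_supply_pressure(self, bar):
        self.log.append(("pressure", bar))

    def read_suction(self):
        return list(self.suction_readings)


def singulate(station, p):
    station.move("above_stack")               # step 1
    station.move("stack_contact")             # step 2
    station.set_supply_pressure(p.minimum)    # step 3
    station.move("clear_of_stack")            # step 4
    station.set_supply_pressure(p.detection)  # step 5
    readings = station.read_suction()         # step 6
    single = all(abs(r - p.expected_suction) <= p.tolerance for r in readings)
    if not single:                            # step 7: place sheets back on the stack
        station.move("stack_contact")
        station.set_supply_pressure(0.0)
        return "rejected"
    station.set_supply_pressure(p.handling)   # step 8
    station.move("destination")
    return "delivered"


good = FakeStation([-0.20, -0.21, -0.19, -0.20])
bad = FakeStation([-0.20, -0.20, -0.35, -0.20])
result_good = singulate(good, Pressures())
result_bad = singulate(bad, Pressures())
```

Keeping the hardware behind a small interface such as `FakeStation` allows the decision logic of steps 6 to 8 to be checked without the robot.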


4.2 Experimental Validation 


By performing the control process described above, it is possible to repeat the stack
separation process indefinitely by moving a stack to a new location and back again.

In a test with a sample size of 500, a single material layer was successfully detected and
separated from the stack 99.6% of the time, i.e. in 498 of 500 attempts. Both errors were
due to the material sticking to the stack when lifting the gripper in step 4. This error is
recognized in step 6 and could have been corrected in a production process.

The same series of tests for separating two layers of material together was successful
in 66% of the cases. The main problems observed in multiple layer handling were:


• The gripper grasping three materials instead of two without this mistake being detected
in step 6, due to large variations in the suction pressure when more than one material
adheres to the gripper.

• The second grasped layer dropping off the gripper during the initial lifting procedure in
step 4 (this error was successfully detected).


5 Conclusion and Outlook 


In this work, attempts at modeling the governing interactions in a low pressure suction
gripper when handling porous materials are presented.

While it was not possible to generate a fully parametric model for predicting the
pressure inside a suction gripper at any given time, the approach nonetheless allows a
process for automated stack separation to be derived, with a success rate of >99%.
This level of confidence would even allow this process to be used in real production
environments. Multiple challenges came up in multi-layer handling operations. While
these might not be as important for production facilities as single ply separation,
furthering the understanding of those processes remains an interesting field of research.

Incorporating and fusing other sensor principles, such as direct force measurement
at the gripper-textile interface as well as optical feedback, could further improve
the success rate and error tolerance of such delicate handling processes.

Furthermore, incorporating artificial intelligence into the control structure could
improve results and might allow for a dynamic reaction to slightly altered environmental
parameters.


Accuracy Examination of a Flexible Pin Gripper
for Forging Applications


Caner-Veli Ince, Jan Geggier and Annika Raatz 


Abstract 


Cost reduction in manufacturing is becoming increasingly relevant. One way to achieve
it is to utilise universal handling systems, which can adapt to changing objects
and various geometries. They thereby minimise the number of handling systems and
set-up times, whereby cost savings are realised. In the field of forging, the objects change
their shape several times during the manufacturing process. In addition, the temperature
can rise up to 1200°C during the different steps of the forging process. Current flexible
handling systems cannot handle those temperatures. The main reason for that is the
material they consist of, primarily elastic polymers. Hence, there is a need for a handling
system that closes the gap between form-flexible and high-temperature handling. For this
purpose, we developed such a handling system in our previous work, consisting of two
jaws with pins in a matrix arrangement. Each pin can move in the longitudinal direction
and adapt to different shapes. To cope with the prevailing temperatures, the pins are made
of a material that withstands high temperatures. This paper presents the actuation and
control of the developed handling system. The system is actuated by pressurised air, which
is continuously controlled to counteract the thermal expansion of the air caused by the
high temperatures. For this purpose, we integrate intelligent valves to realise the automation
and control. Finally, we evaluate the accuracy of our system and optimise the valve control.
1 Introduction


Form-variable handling systems can adapt to different geometries of the handling objects and
fulfil various tasks and operations. Shintake et al. reviewed the so-called soft grippers, which
consist of polymer material [1]. The polymer material is advantageous because of its elastic
behaviour. In combination with physical effects such as granular jamming [2] or the Fin
Ray Effect [3], a universally usable handling system is realised. Despite the benefits of
the elastic polymer material, its maximum operating temperature of 300°C limits the usability
of a soft gripper [4]. For most use cases, this temperature is not exceeded. In this work, the
application in the forging sector is considered. Here, the objects undergo massive geometric
changes and reach temperatures up to 1200°C. The workpiece of a bevel gear,
for example, has a simple cylindrical geometry. After forging, the bevel gear is conically
shaped with teeth on the cylinder surface and offers no parallel surfaces for grasping. A handling
system that can adapt to the varying shape of the handling object would be beneficial for the
automation of these processes. Additionally, using one gripper instead of several reduces
the number of necessary grippers, resulting in a cost reduction. Further, tooling times are
saved.

The Tailored Forming Process is one example of such a procedure. It
is a novel forging process investigated in the Collaborative Research Centre (CRC) 1153 at the
Leibniz University Hannover [5]. The main focus is to develop a process chain to manufacture
particularly tailored hybrid components consisting of multiple materials. The joining process
is the main difference between the Tailored Forming Process and other hybrid manufacturing
processes. Conventional processes place the joining process at the end of the manufacturing
process, at which stage the components have almost their final shape. This fact limits the
possible geometries for the hybrid components. In contrast, the joining is located at the
beginning of the process chain in the Tailored Forming Process. Here, the materials are
merged into semi-finished workpieces with simple shapes, followed by the forging.

To match the forging properties of the combined materials, the workpiece has to be heated
up. Depending on the different materials, a temperature gradient is necessary and is set by
induction heating. Currently, the combinations steel-steel and steel-aluminium are being
investigated. The steel-steel pairing requires temperatures up to 1200°C, which defines the
maximum process temperature.

Several demonstrator components are investigated in the CRC 1153. Their shapes range
from cylindrical shafts and a conical bevel gear to a wishbone and are depicted in [6]. The
bevel gear, for example, starts as a cylindrical workpiece and undergoes a massive change in
shape, which makes it challenging to handle with a single handling system.

Furthermore, the accuracy of the handling system is essential for the Tailored Forming 
Process quality. Induction heating is used to prepare the workpieces for forging. They are 
placed on the induction coil, which requires high precision. Contact between the induction 
coil and the workpiece could damage the coil. In addition, the workpieces have to be placed 
in an exact position in the forging die. Otherwise, the forging could fail. 


This brief overview of the Tailored Forming Process indicates the need for a handling
system that withstands high temperatures of up to 1200 °C, adapts to changing geometries
and fulfils the accuracy requirements.

In this work, a concept for a previously developed form-variable pin gripper for use in
forging environments is investigated further [6]. The Tailored Forming Process
has been introduced above to define the boundary conditions. In the following, the functionality
and automation of different pin grippers are briefly presented to outline the state of the
art. Afterwards, the prototype of the previously developed handling system is presented and
the experimental validation results are discussed. A summary and an outlook complete this
paper.


2 Related Work 


This section gives an overview of pin grippers and their grasping process. At the end of this
section, the gap between the currently available pin grippers and the boundary conditions of
the Tailored Forming Process is pointed out.

The first gripper of this kind is the Omnigripper by Scott [7]. It has an arrangement
of 8 x 16 pins on two slightly separated plates with the same orientation. The pins can
move independently in the vertical direction when they come into contact with an object. When
the Omnigripper is lowered over an object, the pins in contact retract, whereby the negative
of the shape is formed. The plates then move together and the pins clamp the object. The pins
are telescopically designed. When a pin retracts on contact, electrical contact is established
between the inner and outer tubes, activating a switch. With this sensing, the Omnigripper can
acquire 3D data from the pin positions. A host computer starts the gripping process.
The Omnigripper is attached to a robot, and the host computer orders it to grasp, release or
reorient the object. Afterwards, control is handed back to the robot to perform the movement.
With his work, Scott proved the ability to handle a wide range of objects but did not mention
high temperatures or the accuracy achieved.

Similar to Scott, Mo also developed a pin gripper [8]. Mo’s gripper has pins that move
independently in the vertical direction, and the shape is adapted by lowering the gripper over
the object to be handled. In contrast to Scott’s design, Mo’s pins are arranged on one plate,
and the grasp is realised by active rotational actuation of the pins. Some pins have an
elliptical shape, whereby the object is clamped. Both of these grippers need active actuation.

Meintrup developed a system based on pins that consists of two opposing pin jaws [9]. The
system is primarily utilised as a manual clamping device without automation. The pins on
each jaw are close-packed and in contact with each other, and they can move independently
in the horizontal direction. The jaws can be brought together to grasp an object. Thereby, a
mechanical system is activated that presses the pins together. The resulting friction between
the pins blocks their horizontal movement and the pin position is fixed. The activation of the
mechanical system depends on the position of the jaws. The jaws move over a ramp, whereby
the clamping is realised. This passive actuation is advantageous for high temperatures but
not adaptable to changing diameters of the handling objects.

Kim [10] developed a gripper similar to [6] with two opposing pin jaws. The pins are
actuated with pressurised air, whereas the jaws are electrically driven. The stroke of the pins
regulates the process. A resistive foil is attached inside the gripper. If a pin touches the
resistive foil, the foil’s resistance changes and the process is considered completed.

The systems shown are adaptable to different shapes and diameters. However, none of
them is designed for operation under conditions like those of the Tailored Forming Process or
other forging processes. Pin grippers do not need to be made of polymer material to achieve their
shape variability. In addition, the gripper’s jaws can be configured and arranged differently,
allowing further adaptation to the handling task. Electrical grippers or sensors for detecting
the pin position cannot be utilised at high temperatures and have to be adapted. For these
reasons, the pin system is investigated further to realise a handling system for hot forging
workpieces.


3 High Temperature Flexible Handling System 


As described before, the boundary conditions of the Tailored Forming Process are exceptional.
Thus, standard parts like parallel grippers cannot be utilised. Therefore, a flexible
handling system consisting of two pin jaws and a grasping device was developed. The jaws,
Fig. 1b, are the shape-variable part of the system and were part of the previous work [6]. The
grasping device, Fig. 2, aligns and moves the jaws and is designed in this work for the use
case of the forging sector.


3.1 The Pin Jaws 


In the case of the Tailored Forming Process, the high temperatures are problematic for the
handling task. For that reason, the system developed in [6] is actuated by pressurised air
instead of electric motors or hydraulics. Electrical units are not heat resistant, which is why
they are not considered. Hydraulic systems have to be sealed, and the seals on the pin can
be damaged by the heat the pin reaches. If a seal breaks, the system will fail and the
fluid will contaminate the environment. A pneumatic system is normally sealed as well but can
also work without seals. In the event of a leak, only air is released, which is uncritical for the
environment. The high temperature also affects the pressurised air, causing an expansion
that the control circuit can compensate.

In order to use the pressurised air as actuation for the pins, each pin is integrated into
a cylinder like a piston-cylinder system, depicted in Fig. 1a. The pin consists of the piston
and a screwable head. The subdivision is required for assembly purposes and allows the
pinheads to be changed for different tasks. For example, if the gripper handles objects with
polished surfaces, metallic heads could damage them. A seal is attached at the other side
of the piston to prevent pressure loss. The seal is critical because it consists of polymer
material, like the elastic grippers mentioned above. Therefore, a thermal simulation was
carried out to investigate the influence of different settings for material data and contact
time between object and gripper. The results indicate temperatures in an allowed range for
a high temperature stainless steel with a low conduction coefficient. The cylinders with the
inserted pin pistons are assembled on a base plate, whereby the matrix arrangement is formed.

Fig. 1 a Construction of the cylinder-piston system (labels: fitting, pin cylinder, pin). b Pin gripper holding a bearing bushing


3.2 Grasping Device 


The jaws have to be aligned and moved, which requires a grasping device that can withstand
the heat as well as the water from an additional cooling unit, which maintains the handling
object’s heat distribution and is described in more detail in [6]. Due to the temperatures,
seals could be damaged, and the vaporised water can damage the electrics of the grasping
device. To overcome these difficulties, a dedicated grasping device is designed within this
work as follows: The jaws are mounted on a linear guiding system and can move independently.
The linear guiding system also has a clamping system that can be activated at every position
to fix the jaw. Additional double-acting cylinders actuate the jaws by pressurised air. To
protect these jaw cylinders against heat and the cooling fluid, they are mounted on the
backside of a plate, while the linear guidance systems and the jaws are on the front side. The
grasping device is depicted in Fig. 2.

Fig. 2 Components of the pin gripper divided into grasping device and pin jaws


4 Controlling Concept 


After the presentation of the mechanical part of the handling system, the introduction of 
the developed control unit follows. The gripper’s control has a significant influence on 
the accuracy and reliability of the grip. Therefore, this section addresses the basics of the 
underlying control concept and the gripping routine carried out. 


4.1 Concept 


For the realisation of the control, the setup shown in Fig. 3 is chosen: A programmable
logic controller (PLC) is utilised as the central control unit. Sensors are attached to the
pin gripper and connected to the PLC. A hall sensor on each jaw actuation cylinder is
used to measure the jaw’s position. Furthermore, there are laser distance sensors under the
jaws. Their task is to detect the position of the handling object and transmit the data to the
PLC. They operate as placeholders in the concept, as their final application is still being
investigated. Due to the thermal radiation of the hot object, their use is not possible without
further considerations. The PLC processes and corrects position deviations of the object by
adjusting the jaw positions and the pressure in the pin cylinders, which improves the gripping
accuracy. Preliminary grasping experiments showed that the pins of one jaw press in the
pins of the other, depending on the handling object. One assumption is that production
tolerances cause varying friction forces in the cylinder-piston system, which leads to a force
imbalance between the pins. The imbalance is compensated by adapting the pressure
value in the jaws and their position. As seen in Fig. 3, the pressures are set by a valve system
that is connected to the PLC. The valve system measures and adapts the pressures to
equalise air expansion caused by heat impact.
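
The effect that the valve system has to counteract can be estimated with the ideal gas law for a fixed volume, p2 = p1 * T2 / T1 with absolute temperatures. The following minimal sketch combines this estimate with a simple proportional correction. The temperatures, the gain and the deadband are made-up values, and the proportional rule is an assumption for illustration, not the control law of the intelligent valves used here.

```python
def isochoric_pressure(p_abs_bar, t_start_c, t_end_c):
    """Absolute pressure of an ideal gas heated at constant volume."""
    return p_abs_bar * (t_end_c + 273.15) / (t_start_c + 273.15)


def valve_correction(p_set_bar, p_measured_bar, gain=0.5, deadband=0.01):
    """Proportional pressure correction in bar.

    A negative value means venting, a positive value means refilling.
    Errors inside the deadband are ignored to avoid constant switching.
    """
    error = p_set_bar - p_measured_bar
    if abs(error) <= deadband:
        return 0.0
    return gain * error


# Assumed example: trapped air at 3 bar (absolute) warms from 20 °C to 80 °C.
p_hot = isochoric_pressure(3.0, 20.0, 80.0)
step = valve_correction(3.0, p_hot)
```

In this example the pressure rises by roughly 0.6 bar, so the correction is negative and the valve would vent air to return to the set pressure.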

Fig. 3 Setup of the control


4.2 Gripping Routine


A three-phase gripping routine is performed to grip a workpiece securely. In the first phase,
both jaws are moved to the extended position, with the hall sensors on the cylinders used to
check the position. Then the pins are brought back to their initial position by briefly applying
pressure, which resets previously set contours.

In the second phase of the gripping cycle, the gripper is closed. The cylinders first close
the jaws. The closing speed of the jaws is adjusted by the exhaust air throttling, which
controls the airflow out of the jaw cylinder. The throttling is integrated in the pressure
regulating valve and can be adjusted continuously. When the pins come into contact with the
workpiece during the closing process, they retract and the pin matrix maps the profile of the
workpiece surface. The pneumatic clamping system is then activated, and the pins are
re-pressurised for a secure grip.

In the third phase of the routine, the grip is released. The cylinders are pressurised again
in the opening direction and the pins are vented. The clamping elements are then deactivated, so
the grip is released abruptly as the cylinders have already been pressurised. All components
are vented when the defined extended position is reached again. With the handling and
control systems presented, the gripping accuracy is validated next to examine whether
the required handling precision is reached.
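
The order of operations in the three phases can be summarised in a short sketch. It is written in Python for illustration only and does not reproduce the PLC program. The step names and the `RecordingGripper` class are hypothetical and merely record which action would be triggered.

```python
# Steps of the three-phase routine in the order described in the text.
PHASES = {
    "prepare": ["extend_jaws", "check_hall_sensors", "reset_pins"],
    "close": ["close_jaws_throttled", "activate_clamping", "pressurise_pins"],
    "release": ["pressurise_opening", "vent_pins", "deactivate_clamping", "vent_all"],
}


class RecordingGripper:
    """Stand-in for the real gripper: every requested action is only logged."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Called only for attributes that do not exist, i.e. the step names.
        def action():
            self.calls.append(name)
        return action


def run_phase(gripper, phase):
    for step in PHASES[phase]:
        getattr(gripper, step)()


def gripping_cycle(gripper):
    for phase in ("prepare", "close", "release"):
        run_phase(gripper, phase)
    return gripper.calls


calls = gripping_cycle(RecordingGripper())
```

Holding the sequence in a table makes the ordering constraints explicit, for example that the clamping is activated before the pins are re-pressurised and that all components are vented last.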


5 Validation 


The designed system is analysed experimentally to verify the accuracy and provide fundamental
knowledge for subsequent system optimisation. This section first describes the experimental
setup and then presents the results.


5.1 Experiments


The gripping routine presented in Sect. 4.2 is performed to investigate the gripping process.
The aim is to measure the deviation of the object caused by the handling system and to
minimise it. The output variable by which the results are evaluated is the relative
position change of the workpiece during gripping. In all experiments, a hybrid hollow cylinder
workpiece with an outer diameter of 62 mm and a height of 84 mm is gripped. Two laser
distance sensors measure the x- and y-position of the cylinder. The x- and y-directions of the
handling system are shown in Fig. 3 and the setup of the testbed is depicted in Fig. 2. During
preliminary tests, the closing velocity of the jaws was identified as the main parameter. The
velocity is determined by the inflow and outflow of air at the jaw cylinder. Therefore,
the air inlet and exhaust air throttling are considered. For the exhaust, a maximum value of
2% is chosen, since higher values make the jaw move too fast. For the air inlet, the range
from 0.01% to 100% is tested. Initially, no mechanical synchronisation of the jaw cylinders is
used. Afterwards, a synchronisation is integrated because asynchrony was observed
in the closing movement: the jaw cylinders are mechanically coupled through a lever. When
the piston of one cylinder moves, the other is automatically set in motion, whereby
synchronisation of the jaws is achieved. In total, an experimental design with 20 parameter
settings is carried out. Each experiment is repeated three times in randomised order.


5.2 Results 


The results are depicted in a box plot in Fig. 4. Here, only the x-direction is evaluated because
it shows a larger deviation than the y-direction due to asynchrony. The results without
synchronisation are presented first, followed by the results with the mechanical jaw coupling.
The deviations for air inlet throttle values of 0.01%, 50% and
100% are high, and precise handling could not be realised. An exhaust air throttle of
0.4% gives the worst results in every uncoupled case, whereas 1.6% and 2% for the
exhaust give the best repeatability for all uncoupled settings. Based on the results,
one can conclude that the exhaust air throttle has a more significant effect on the accuracy
than the air inlet throttle. Observations made during the experiments explain
the inaccuracies. Depending on the exhaust air throttle, there was a delay in the closing
movement between the jaws, whereby one jaw reached the object earlier and forced it out
of the centre. The accuracy achieved with non-synchronised jaw cylinders is insufficient for
the Tailored Forming Process.

The coupling reduced the deviation drastically, as seen for the coupled air inlet throttle
of 50% (synchronised) in Fig. 4. The best results are achieved for the exhaust air throttle of
1.6% with a deviation between 0.2 mm and 0.7 mm. The deviation has to be reduced further
to operate in the Tailored Forming Process successfully.

Fig. 4 Box plot of the experimental investigation. The influence of the parameters on the accuracy is
tested. Three sets without synchronised jaw cylinders and one set with synchronised jaw cylinders


The validation reveals potential for further mechanical improvements. Some pins
do not retract on contact as assumed, whereby the object is moved.
Investigations show that the chosen manufacturing tolerances are not adequate for every pin.
The coaxiality of the cylinder drilling and the base plate drilling must be toleranced more
tightly to prevent pin clamping.


6 Summary 


This work points out the advantages of shape-variable grippers and their ability to adapt
to varying geometries as well as their benefit for the forging sector. Current shape-variable
grippers cannot be utilised in the forging sector because they are made of a polymer material
that has a limited operating temperature of 300°C. The forging object
exceeds this limit, reaching temperatures up to 1200°C.

Therefore, a pin gripper that is variable in shape as well as resistant to high forging
temperatures was developed. The pin gripper’s construction and the control to achieve precise
handling were presented. The control is necessary for the automation of the handling
process and for carrying out the gripping routine.

After implementing the routine, experiments were carried out to verify the accuracy of the 
handling system and provide fundamental knowledge for optimisations. For the experiments, 
two parameters were investigated. First, the exhaust air throttle and second, the air inlet 
throttle. The exhaust air throttle impacts the system more than the air inlet throttle. The 
overall accuracy was insufficient, which is why a mechanical coupling of the jaws was 
installed. This had a positive impact on the accuracy. The experiments showed other critical 
points that must be investigated further, like the manufacturing tolerances. 


7 Outlook


Observations during the experiments showed optimisation potential for the construction of
the jaws. The current jaw consists of a base plate with mounted pin cylinders, which the
experiments identified as an error source. The base plate and the pin cylinders can be
manufactured as one part to eliminate this error. Consequently, the drillings achieve a better
coaxiality and the risk of clamping during retraction decreases.

Furthermore, the routine could be enhanced by involving the sensors for the jaw positioning
in the closing process. Currently, the jaws close until they are in contact with the
object and cannot move further. In future, the jaws could be stopped before contact occurs,
followed by pushing out the pins. This would add an electronic coupling to the
mechanical one.

For experiments under forging conditions, secure and precise handling must be ensured
beforehand. The experiments have shown where improvements are required, so
the adjustments can now be made and the influence of temperature examined
in future.


Use of Autonomous UAVs for Material
Supply in Terminal Strip Assembly


Marius Boshoff, Michael Miro, Martin Sudhoff and Bernd Kuhlenkötter 


Abstract 


The multi-variant small part assembly of terminal strips requires innovative
approaches for automated picking in order to meet increased product variability demands
with increasing process flexibility. Unmanned Aerial Vehicles (UAVs) are likely to be
used in material supply for small parts and, therefore, to replace manual picking of parts
that are rarely needed in assembly. An autonomous material supply in the third dimension
could break up fixed assembly processes, reduce picking time and raise
production flexibility. In this article, the share of manual picking time in a real terminal
strip assembly line is determined, and UAVs as a potential transport solution for terminals
and jumpers are presented.


Keywords 
1 Terminal Strip Assembly: The Need for Improving Picking
Automation


Today’s manufacturing processes are characterized by an increasing customer-specific
individualization of products and require new process strategies for batch size 1 to
remain profitable with increasing process complexity. This market trend is expressed in
mass customization and represents a cross-industry challenge for manufacturing companies
[1]. However, the desired mass customization of products prohibits the use of inflexible
standard automation approaches and requires flexible, economically sensible and
technically feasible automation systems. Permanently installed production lines, consisting
of fixed conveyor technology, automatic machines, robots and safety equipment, do
not offer the production flexibility that mass customization needs for a low-effort redesign
of assembly lines and for meeting increased customer demands [2].

Like many other industries, switch cabinet manufacturing is subject to the described
phenomenon of mass customization, as a wide variety of products can be assembled in
different ways to meet customer-specific requirements [3]. Switch cabinets are elementary
machine and plant components, as they are used to distribute, regulate and control
power using switchgear and other components (see, for example, [6, 7]). Phoenix
Contact GmbH & Co. KG and the Chair of Production Systems (LPS) have therefore
been working together closely since 2016 to develop solutions for smart assembly in
switch cabinet production, including the use of crucial Industry 4.0 technologies. As
part of the research cooperation, an assembly system was set up in the LPS learning and
research factory (LFF). Actual assembly orders are carried out according to the customer’s
needs, and a dynamic technology transfer of new automation concepts from science
to industry takes place.

The required small electrical parts have a size of a few centimeters, a weight of 5 
to 50 g and are manually picked from a decentral storage rack. Improving the manual 
picking efficiency is one of the most critical issues in the current logistics industry [2]. New
warehouse automation concepts must be tailored to their particular needs to reduce 
manual picking time [4]. Therefore, the terminal strip assembly will be presented as an 
exemplary field of application for UAVs in automated picking. Thus, the need for auto- 
mated picking will be examined based on the proportion of manual activities for provid- 
ing material like terminals and jumpers in a first step. Subsequently, a concept for UAVs 
in small parts assembly will be presented and discussed. It should be noted that one aim 
of the described method is general applicability in different production domains. In this 
context, the terminal strip assembly serves as an exemplary application. 


1.1 Material Supply of Terminals and Jumpers 


The assembly of terminal strips is carried out in an assembly line. Specific tasks must 
be carried out at each workstation to finalize the assembly process of terminal strips (see 
Fig. 1). Relevant in the scope of this paper are picking processes for workstation 1, the 
terminal assembly, and workstation 3, the assembly of jumpers. Picking parts and pre- 
paring workstation 1 is done according to production order in advance. After the number 
of required parts has been prepared, the terminal strips are assembled directly. Parts at 
workstation 3, on the other hand, are not picked in advance but are procured from the 
storage rack as soon as a container of parts is empty. 
Fig. 1 Overview of the assembly line [8]


In [5], the optimization of workstation 1 was discussed, and the implementation of 
a workstation concept was presented for the addressed assembly line. Having the mate- 
rial ready and in place for production supports the assembly workflow, as the employee 
works without interruption and loss of concentration. This, in turn, affects the worksta- 
tion’s design, as it must offer enough capacity to store material and be minimized to 
save production area at the same time. Nevertheless, there is still demand for parts that
are needed only in comparatively small quantities and are not stored at workstation 1 but
in a separate storage rack. The current production process is first examined to determine
the share of picking time for the terminal strip assembly and thus the loss of value-adding
assembly time. The first step is to analyze the share of picking time in relation to the
total production time, divided into the respective work steps of workstation 1 and
workstation 3. For this purpose, actual production data from real orders are investigated.
The methodology of data acquisition is described in Sudhoff et al. 2020 [6, 7].

Based on the production data, an ABC analysis is then carried out. In a conventional ABC
analysis, components with a considerable proportion of the turnover are assigned to class A,
whereas less needed components are assigned to class B or C. As the scope of this paper is
to determine the distribution of parts in manual assembly and derive picking times for
each component, the ABC analysis is conducted in terms of frequency of use. Class A covers
the cumulative range from 0 to 80%, class B from 80 to 95%, and class C from 95 to 100%.
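
The ABC classification by frequency of use can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `abc_classify`, the part names and the usage counts are hypothetical, and the sketch assumes that a part is classified by the cumulative share reached after it has been added.

```python
from typing import Dict


def abc_classify(usage: Dict[str, int],
                 a_limit: float = 0.80,
                 b_limit: float = 0.95) -> Dict[str, str]:
    """Assign each part to class A, B or C by cumulative share of use."""
    total = sum(usage.values())
    classes: Dict[str, str] = {}
    cumulative = 0
    # Walk through the parts from most to least frequently used.
    ranked = sorted(usage.items(), key=lambda kv: kv[1], reverse=True)
    for part, count in ranked:
        cumulative += count
        share = cumulative / total
        if share <= a_limit:
            classes[part] = "A"
        elif share <= b_limit:
            classes[part] = "B"
        else:
            classes[part] = "C"
    return classes


# Hypothetical usage counts, NOT taken from the production data.
demo_usage = {"T1": 500, "T2": 300, "T3": 120, "T4": 50, "T5": 30}
demo_classes = abc_classify(demo_usage)
print(demo_classes)
```

With the limits of 80% and 95% used in the text, the two most frequent hypothetical parts fall into class A, while the rarely used parts end up in classes B and C and would be candidates for a separate storage rack.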

Consequently, parts that are frequently used to assemble terminal strips (class A) are 
usually stored within the worker’s reach. Parts that are not commonly used (class B and 
C), on the other hand, are typically stored further away because of space requirements. In
the underlying example, class B parts are stored at a distance of 8 m, and class C parts
are located on a shelf 10 m from the assembly station. It can be assumed that this example
understates rather than overstates the picking distance for B and C parts in most cases, as
demonstrated in [9].
1.2 Time-Consuming Picking Process for Assembly of Terminal
Strips


The conducted assembly process analysis is based on evaluated data taken from January until 
November of 2021. In this period, 8,944 actual production orders have been carried out. The 
complete assembly process of terminal strips can be divided into the operating activities of 
production, preparation, administration, rework, and logistics. The activities production and 
preparation account for most of the time spent on the assembly processes and are therefore 
of interest for further investigation. Production covers only work steps such as clamping
on electric parts, wiring, labelling, or executing quality testing. Preparation refers to
order-picking the required parts, such as terminals and jumpers. The necessary parts are
picked from a storage rack, transferred to the respective workstation, and placed for
production. Many components, such as jumpers, sometimes need a pre-assembly step.
Pre-assembly steps and the printing of labelling markers are carried out as part of
preparation as well.

Fig. 2(a) shows the relative proportion of the work steps from the overall assembly 
process. In total, a period of 997.2 h was analyzed, which is equal to 124.7 working days 
of eight hours. 

With 78.4%, the most time-consuming assembly share is the production. How- 
ever, with 17.7%, preparation takes the second-biggest share. Concerning the period of 
997.2 h, the time spent for preparation is equal to 176.5 h or 22.1 working days. Logis- 
tics and rework usually hold a minor share, while administration takes a little more time. 

A detailed analysis of the preparation time for single tasks is presented in Fig. 2(b).
With 43.1%, preprinting of labelling markers takes the largest share, while
39.1% is spent on picking terminals. A further 14.3% of the time is used for preparing
shipping material like cardboard, 1.6% for picking jumpers, and 1.4% for preparing end
terminals in an automated robotic clamping application. It may be noted that the observed
time spent on picking jumpers is unusually low. One reason is that the orders carried out
required only a small variety of jumpers, resulting in less time spent on picking those parts.

Fig. 2 In (a) time allocation in percent for the work steps (Production, Preparation,
Administration, Rework, Logistics), evaluated data taken from January until November of 2021,
and in (b) time spent in percent for picking of specific parts or preparing an assembly step
(Prepare Labelling, Picking Terminals, Prepare Shipping, Picking Jumpers)

Fig. 3 Amount of assembled terminals in 2021

The resulting values are used to calculate the time share for picking terminals and
jumpers in the given period. The evaluation showed that 71.84 h were spent for the pick- 
ing of terminals and jumpers, which is equal to 7.2% of the entire assembly time. If it is 
possible to reduce this share of lost effort with a new supply concept, the time saved may 
be shifted to a value-adding activity. 
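
The figures quoted above can be retraced with a few lines of Python. This is only a worked check of the arithmetic; the variable names are chosen for illustration, and the preparation time is rounded to one decimal place, as the text reports it, before the picking shares are applied.

```python
# Values taken from the text.
TOTAL_HOURS = 997.2        # analyzed period in hours
PREPARATION_SHARE = 0.177  # share of preparation in the total time
TERMINAL_SHARE = 0.391     # share of terminal picking within preparation
JUMPER_SHARE = 0.016       # share of jumper picking within preparation

# Preparation time, rounded to one decimal place as in the text.
preparation_hours = round(TOTAL_HOURS * PREPARATION_SHARE, 1)

# Time spent on picking terminals and jumpers, and its share of the total.
picking_hours = preparation_hours * (TERMINAL_SHARE + JUMPER_SHARE)
picking_share = picking_hours / TOTAL_HOURS

print(f"Preparation: {preparation_hours:.1f} h "
      f"({preparation_hours / 8:.1f} working days)")
print(f"Picking terminals and jumpers: {picking_hours:.2f} h "
      f"({picking_share:.1%} of the total time)")
```

The picking share of the total time is thus the product of the preparation share and the combined picking shares within preparation.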


1.3 Demand of Terminals 


A huge variety of terminal variants was used to assemble terminal strips, but their actual
demand varies strongly, depending on the order situation. Figure 3 shows the demand for
terminals in the given period. In total, 56 different terminal variants were used, but only
some can be listed here. For example, 237,245 terminals of the variant PT 1,5/S-TWIN/IP
were assembled, but only 600 of the variant USLKG 5.

The cumulated amount of the different terminal variants is presented within an ABC analysis
in Fig. 4. Based on a standardized ABC analysis, class A ranges from 0 to 80%,
class B from above 80% to 95%, and class C from above 95% to 100%.

Only 7 of the 56 terminal variants make up nearly 80% of the used parts. Conversely,
49 variants must be kept in stock in a separate storage and accessed when needed. The
same analysis procedure was also used for the supply of jumpers in the same period. The
demand for the individual jumper variants is shown in Fig. 5, and the corresponding ABC
analysis is presented in Fig. 6.

Fig. 4 Cumulated amount of terminals by their variant, divided into ABC-categories

Fig. 5 Amount of assembled jumpers in 2021 (axes: Jumper Variant, Amount)

In total, 24,217 jumpers of 25 different variants were assembled. Again, a considerable
gap in demand between the individual jumper variants can be observed, which means a
stocking and storage situation comparable to that of the terminals. In contrast to
the ABC analysis of the terminals, however, a comparatively larger share of the jumper
variants, 9 out of 25, falls into class A. As jumpers are smaller than terminals, the
space requirements for storing A-parts remain similar. But there are still 16 jumper
variants that are assigned to class B and C.

Fig. 6 Cumulated amount of jumpers by their variant, divided into ABC-categories


2 Innovative Picking Automation with UAVs 


New picking concepts and automated production units are needed to sustainably reduce the
effort of providing terminals and jumpers. In this context, possibilities of material
provision with UAVs are discussed, which are now being subjected to a feasibility study in
the chair’s learning factory. Integrating an automated UAV-based supply into the assembly
line might open possibilities for assembly concepts like just-in-time delivery. These
concepts might break up the assembly line into a more flexible structure and even lead to
reduced production effort. For this purpose, new supply concepts have to be shaped, and
new technologies must be used.


2.1 Automation Concepts for the Material Supply of Terminal 
Strip Assembly 


For the assembly line considered, the use of Automated Guided Vehicles (AGVs) is
unsuitable for the provision of small parts since the AGVs would have to constantly
avoid employees in a highly flexible working environment or would block the narrow paths of
the assembly line for the passage of employees [10]. Given the flexible nature of terminal
strip assembly, fixed conveyor technology is not considered either, due to the high
reconfiguration effort at every order change [11].

Earlier studies showed UAVs to be an underestimated option for indoor material
supply [10, 12]. However, the use of UAVs as transportation units has neither been tested
nor documented for an actual material supply scenario [2, 13]. With a UAV-based supply,
manual picking effort could be decreased to a minimum. This approach is innovative for
small-part assembly and meets the need for increasing product variability with increasing
process flexibility.


2.2 UAVs for Material Delivery Tasks 


The term Unmanned Aerial Vehicle (UAV) describes small self-flying vehicles without 
any pilot controlling the aircraft. In the field of computer science and artificial intelli- 
gence, mostly the terms UAV, UAS (Unmanned Aerial System), VTOL UAV (Vertical 
Take-Off and Landing UAV) or Multirotor UAVs are used [14]. In most cases, four rotors 
lift the device, enabling the UAV to become a VTOL unit. Besides the Quadcopter UAV, 
there are also Helicopter UAVs and Fixed-wing UAVs. All of them come with their own 
strengths and weaknesses, as stated in [15]. For production environments with limited 
space and high demands regarding safety and reliability, quadcopters are the preferred 
choice as they are most likely to meet the requirements. 

In intralogistics, UAVs have not been discussed much yet, though their potential
might be huge for industrial applications [10]. UAVs promise to be faster, more flexible,
more space-saving and more cost-effective than, for example, material supply with
mobile robots [12, 16]. The automated supply of workstations transforms the conveyor
line with fixed routes into a highly flexible, multidimensional material supply. A conceptual
assembly line configured for automated picking by UAVs is visualized in Fig. 7.

The image shows workstations 1 and 3 being supplemented by a loading station (1), 
in which UAVs are equipped with the respective order material. The loading station 
holds the parts of classes B and C that were analyzed in Sect. 1.3, and an industrial robot 
picks the parts from a storage rack to hand them over to the UAV. As material provi- 
sioning is carried out automatically by the UAVs, the loading of UAVs should also be 
automated. In a first attempt, an industrial robot could fulfill the task by picking material 
from a storage rack and placing it in a delivery container attached to the UAVs. Although 
attaching a delivery container or weight to a UAV has been reported in literature [17], 
UAV-loading concepts must be evaluated in real scenarios. Simulations or preliminary 
considerations cannot provide a reliable result. 

The overall system is integrated into the assembly system’s existing architecture, 
involving CLIP PROJECT for task planning and management. The employee could 
be provided with an interface for monitoring and controlling the system. With the help
of a localization system, for example, an ultra-wideband system (UWB system), the
position of the flying robots could be determined and transmitted to the control system.
Localization is a crucial aspect regarding indoor UAVs due to their inability to use GPS.
Although indoor localization has been a topic for a long time, it is still an active field 
of research, as can be seen in [18]. For indoor localization, UWB systems have already 
proved to be a reliable localization technology for aerial robotics in numerous studies. 
Besides UWB, motion capture technology is also a reported option in many UAV appli- 
cations [19]. 


3 Conclusion and Next Steps


As the analysis of picking time in terminal strip assembly showed, small and lightweight
products with a great demand for manual picking processes are predestined to be supplied
by UAVs. Production data of 8,944 orders were evaluated, and the share of
picking time for terminals and jumpers was about 7.2%. Within an ABC analysis, it
could be shown that many different parts of class B and C cannot be stored directly at the
assembly station. These parts must be stored in a separate material rack with a walkway
of about 10 m. Manual picking of parts by an employee results in a direct loss in value,
as the time spent picking might be used for a value-adding activity in production.

Fig. 7 Supply of terminal strip parts in an automated picking process. (1) Industrial Robot ABB
IRB 120 in a loading station. (2) Assembly stations for terminals and jumpers. (3) UAVs. (4)
Dummy for Localization system. (5) Storage rack

Due to their low weight and size, a UAV might carry terminals and jumpers, and
the traditional supply of these parts can therefore be automated. A first approach
for a system structure was presented, and a workflow for automated picking by UAVs
was introduced. Afterwards, existing challenges and barriers for implementation were
discussed, and research questions were derived. To evaluate the presented approach, the
system structure will be implemented in the near future. Because production takes
place for actual customer orders, an isolated test field with the discussed configuration
will be set up. After the first successful flights, the fulfilment of safety guidelines will be
addressed.
Abstract


The term Internet of Things (IoT) denotes a communication network, where various 
Things are interconnected using novel scenario-specific Internet technologies and pre- 
defined customizable semantics. Industry 4.0 aims at enabling a globally networked 
production, with the use of IoT as a crucial concept. Since production systems tend 
to be technically and organizationally heterogeneous, distributed at different locations, 
and associated with large amounts of data, communication and information processing 
platforms that provide confidentiality, integrity, availability (CIA rules), access control, 
and privacy are needed. In this contribution, we introduce a concept for designing and 
operating heterogeneous and spatially distributed industrial systems with Digital Twins, 
connected via an IoT communication infrastructure, the Smart Systems Service Infrastructure
(S3I). By demonstrating an industrial use case with our concept, it is shown that
the S3I can be used as a cross-domain solution for the interconnection of devices in a 
distributed production scenario. 
1 Introduction


Industry 4.0 is concerned with the digital transformation of industries and is known worldwide,
especially in the manufacturing sector. In this context, traditional industries are
going to be combined with novel technologies such as Cyber-Physical Systems, the Internet 
of Things, Cloud Computing, and Big Data to enable a globally networked, personalized, and 
goal-oriented Smart Production [14]. The term Internet of Things (IoT), a crucial component 
at the forefront of Industry 4.0, aggregates various everyday objects to collect, exchange, 
process, and visualize data through the integration of scenario-specific Internet technologies 
and predefined customizable semantics to enable the situation-specific choreography in 
different domains. 

The introduction of Industry 4.0 will inevitably lead to changes in the supply chain [2] 
to respond more flexibly to the adaptation of various technologies. As in the automotive 
industry, a car may consist of more than 30,000 different components produced from different 
raw materials and various manufacturing processes. In this regard, with increasing demand 
for transparency and flexibility in today’s production systems, traditional manufacturing is 
confronted with the transition from a centralized, production-based manufacturing model 
to a distributed, small-scale, and loosely coupled model. 

Distributed production [20], a new form of localized manufacturing, eliminates the need 
for companies to forecast demand and maintain large inventories, and also enables the flexi- 
bility to reconfigure production structures [15]. An important aspect of distributed production 
is interconnectivity among distributed systems and their devices. In this context, how to deal
with technical and organizational heterogeneity and ensure the confidentiality, integrity, and
availability (CIA rules) of communication and information processing is turning out to be a
primary concern.

In this paper, we focus on distributed production systems and contribute a concept to 
network heterogeneous and spatially distributed production systems with Digital Twins [13], 
connected via the Smart Systems Service Infrastructure (S3I, depicted in Fig. 1) [3, 17]. 
The S3I was initially developed as an IoT communication infrastructure to interconnect and
orchestrate the so-called Forestry 4.0 Things [3, 5, 19]. The remainder of this paper is
structured as follows: Sect. 2 summarizes state-of-the-art communication architectures in
distributed manufacturing. A general concept including the requirements is illustrated in
Sect. 3. Its implementation in a simulation-based application is introduced in Sect. 4. In
Sect. 5, the paper is concluded.
Fig. 1 The Smart Systems Service Infrastructure as IoT communication infrastructure provides
various services to interconnect decentralized Forestry 4.0 Things


2 State of the Art 


In this section, we summarize some state-of-the-art communication architecture solutions 
for industrial distributed production systems. 


2.1 Centralized ERP 


Enterprise Resource Planning (ERP) refers to a comprehensive software solution for the
central management of companies’ resources. As proposed by Thomas Andre [16], the
integration of an ERP system into the process flow allows decisions to be broadcast
hierarchically from the upper levels to the lower levels, where they are managed in a
distributed, autonomous way and dispatched explicitly to the respective executor. Similarly,
George L. Kovacs contributes a web-based solution [9] that uses ERP systems as flow
management solutions to manage scalable, multi-agent, multi-company production.


2.2 Cloud-based Solutions 


Cloud computing uses IT and its associated technologies to drive the digital transformation
of the manufacturing industry towards on-demand computing services. Following this
approach, Xu proposes a layered architecture of a cloud manufacturing system [21]. This
proposal incorporates a resource layer to deal with static and dynamic software
and hardware resources, a virtual service layer in charge of identifying manufacturing
resources, a global service layer that works together with cloud technologies, and an
application layer dealing with user interactions. Rimal contributes architectural
requirements [12] for cloud providers, enterprises, and cloud users, respectively. These can
be summarised as general requirements for cloud system design.


2.3 AAS-based Networking 


The Asset Administration Shell (AAS), a concept associated with RAMI4.0 [7] and regarded
as the I4.0 equivalent of Digital Twins, can be combined with its asset (e.g. device, machine,
equipment, etc.) to form a Component that represents all relevant data with a uniform
interface [18]. As a middleware for Industry 4.0, BaSys 4.0 is concerned with 1) decentralized
connection of AASs, 2) the Virtual Automation Bus [10] as an implementation of end-to-end
communication, and 3) service-oriented process control. Using BaSys 4.0, Antonino et al.
developed an automatic pallet transport system that bundles high-level control and
monitoring of the system status [1]. Perzylo et al. [11] introduce a concept that adopts
capability-based semantic annotations of existing information models to enrich device models,
aiming at the orchestration of high-level skills from the perspective of BaSys 4.0.


2.4 Summary 


Despite their successful applications in various industrial fields, the reference architectures
described above still have several considerable limitations and debatable aspects. Firstly,
as reported by Sun [4], the failure rate of ERP system implementations ranges
from 40 to 60%. Furthermore, ERP systems focus on interaction mainly at the upper
levels. Hence, they are not able to deal with the events triggered at the lower levels of
production [16]. Besides, the heterogeneity of production systems and the lack of
semantic interoperability make the interconnection even more difficult [8]. The introduction
of cloud-based technology into the industry raises concerns about sensitive manufacturing
information. Moreover, not all cloud users want to store their data in the cloud and accept
the security mechanism provided by the cloud provider. The architecture of AAS-based
networking covers the basic requirements for the RAMI4.0 framework, but its openness and
security are still open issues for globally networked production systems.

The use of the proposed concept brings various benefits to industrial distributed production
systems. First, the S3I combines several standard protocols to ensure the access security of
everything connected to the infrastructure. Second, the S3I’s distributed concept allows all
Things to be managed without centralized integration and without restricting the storage and
management of resources to a central location. In addition, the S3I accommodates technical and
organizational heterogeneity and ensures transparent, mutually understandable interactions
through customizable semantics.


3 Concepts 


Faced with the shortcomings of the current industrial communication architectures intro- 
duced in Sect.2, we propose in this section our concept to interconnect heterogeneous and 
spatially distributed production systems with Digital Twins, focusing on the aspects of 
secured communication and interoperability by means of the proposed semantics. 


3.1 Requirements 


Our concept is presented under consideration of the following requirements: Authentication
denotes that the identity of all participants in the IoT must be verified, either
decentrally or centrally, before they are connected to the IoT. Confidentiality emphasizes that
only authorized users have the right to access protected resources, especially during
the exchange of data. Integrity refers not only to the completeness of the data but also to the
accuracy and truthfulness of the exchanged data. Data integrity can be ensured by adopting,
e.g., a symmetric or asymmetric data encryption approach. Heterogeneity is related both to
technical and organizational aspects originating from large and time-varying value-added
networks with different actors. Interoperability refers to the capability of transparent
interconnection between all communication participants. It can be supported by, for example,
a Semantic Data Model [6], a common language “spoken” by all participants, or a tool to
depict the content of Things at the meta-level.


3.2 Digital Twin 


The definition of Digital Twin varies slightly depending on the emphasis in different fields.
In general, Digital Twins are understood as 1-to-1 replicas of real-world objects. In this
context, Digital Twins are continuously updated during their entire life cycle through the
internal and digital connection to their represented Assets. The interconnection between
Digital Twins requires a capacity to extract valuable insights from large amounts of data
originating from diverse devices, services, processes, systems, etc. Hence, semantic modeling,
which is used to describe the relationships between data values, is incorporated into our
concept and lets Digital Twins understand the other Digital Twins connected to them.
3.3 From Digital Twin to I4.0 Things


We define the combination of an asset and its Digital Twin as an Industry 4.0 Component (I4.0
Component). Together with Human-Machine Interfaces (HMIs) and software services, they
are termed Industry 4.0 Things (I4.0 Things), which can be seen as nodes of the IoT in charge of
collecting, exchanging, processing and visualizing data while being networked with others.
An I4.0 Thing is globally uniquely identifiable, has predefined properties and interfaces, and
supports standardized services. It can be connected to a goal-oriented Industry 4.0 System
(I4.0 System) that consists of various I4.0 Things. The integration of Digital Twins in the IoT
enables a standardized interface for everything connected to the IoT, making Things
accessible nodes. Digital Twins can also be considered as software runtime environments
that provide a virtual space for data processing and simulation.

Figure 2 illustrates a simplified Semantic Data Model of I4.0 Things in our concept, which
uniformly defines the structure as well as the existing properties and callable functions provided
by I4.0 Things. The data model denotes that each Thing has a unique identity managed in a
central identity management service. Furthermore, each Thing can restrict access by
others and define the access policy, i.e. who can access it with given permissions. It also
exposes endpoints to the external world, through which Things can be reached to provide
values (via Property) and service functions (via Functionality). Each Thing can be partitioned
into smaller but independent Things (via hasSubThings), just as a car is composed of an engine,
four tires, etc. An engine can be modeled as an independent Subthing of a car and provides,
e.g., rpm value and temperature. Furthermore, diverse I4.0 Things can be associated to enable
situation-specific choreography, forming an I4.0 System.

Fig. 2 UML class diagram illustrates the Semantic Data Model applied to model and implement
Industry 4.0 Things (relations visible in the diagram include hasFeatures and hasSubFeatures)
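
The structure of such a data model can be illustrated with a small Python sketch. It is a simplified reading of the description above, not the S3I data model itself: the class `Thing`, its field names, the flat access-rule mapping and all identifiers in the demo are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Thing:
    """Hypothetical, simplified model of an I4.0 Thing."""
    identifier: str                       # globally unique identity
    name: str
    properties: Dict[str, float] = field(default_factory=dict)
    endpoints: List[str] = field(default_factory=list)
    # Assumed flat access policy: requester identifier -> permission.
    access_rules: Dict[str, str] = field(default_factory=dict)
    sub_things: List["Thing"] = field(default_factory=list)

    def read(self, requester: str, prop: str) -> float:
        """Return a property value if the requester holds READ permission."""
        if self.access_rules.get(requester) != "READ":
            raise PermissionError(f"{requester} may not read {self.identifier}")
        return self.properties[prop]

    def find(self, identifier: str) -> Optional["Thing"]:
        """Search this Thing and its sub-Things depth-first."""
        if self.identifier == identifier:
            return self
        for sub in self.sub_things:
            hit = sub.find(identifier)
            if hit is not None:
                return hit
        return None


# Demo with made-up identifiers and values: a car with an engine sub-Thing.
engine = Thing(
    identifier="s3i:demo-car:engine",
    name="Engine",
    properties={"rpm": 2500.0, "temperature": 90.0},
    access_rules={"s3i:4712": "READ"},
)
car = Thing(identifier="s3i:demo-car", name="Car", sub_things=[engine])

found = car.find("s3i:demo-car:engine")
rpm = found.read("s3i:4712", "rpm") if found else None
print(rpm)
```

In this sketch the engine stays an independent Thing with its own identity and access policy, while the car only references it, which mirrors the partitioning into sub-Things described above.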


3.4 Platform 


As I4.0 Things are defined as “worldwide identifiable participants” [18] that are able to
communicate and could be distributed over large areas, a central infrastructure with a few
essential software services is required to realize a decentralized interconnection of those
Things. These services enable I4.0 Things to authenticate themselves, store and retrieve their
properties and features in a database, and communicate end-to-end with each other in
compliance with the given permissions. The S3I as an IoT infrastructure provides a directory
service (via S3I Directory), OAuth 2.0 authentication (via S3I Identity Provider), optional
message-based asynchronous communication (via S3I Broker using AMQP), and optional
cloud storage (via S3I Repository). The use of the S3I is domain-independent, addresses the
shortcomings enumerated in Sect. 2.4 and meets the requirements listed in Sect. 3.1.


4 Application 


In this section, we implement the concept described above in a simulation-based scenario
to demonstrate the communication between distributed production systems, including their
I4.0 components, services, and HMIs.

The use of the S3I enables integrated high-level communication between different factories
that are networked over large areas. Meanwhile, the interaction at the system level is centrally
managed by the S3I. The example in Fig. 3 illustrates how the I4.0 Things are networked
with the S3I. In our application, Factory n attempts to get the current production status of
Factory 1, see Fig. 4. All the Things appearing in this scenario are modeled as I4.0 Things
using the semantic model presented in Sect. 3.3. As an example, Fig. 5 depicts the meta
information of Factory 1 in JSON format. Using the standardized REST API of the S3I
Directory, Factory n retrieves the endpoint and the interface provided by Factory 1 with a
valid access token issued by the S3I Identity Provider. Subsequently, Factory n composes
an encrypted and signed message including the concrete request content and sends it to Factory
1 via the S3I Broker. Because Factory n has previously obtained the access right to Factory 1,
Factory 1 is allowed to give an appropriate response. As a result, Factory n obtains the status
of the field devices in Factory 1, likewise via the S3I Broker.
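
The sequence of token request, directory lookup and brokered message exchange can be mimicked with in-memory stand-ins. The sketch below does not use the real S3I services or their APIs: the classes `IdentityProvider`, `Directory` and `Broker`, the flat access-rule mapping, the pre-shared key and the returned status are all hypothetical. Messages are only signed with an HMAC as a stand-in for the signing step; encryption is omitted.

```python
import hashlib
import hmac
import json
import secrets
from typing import Callable, Dict


class IdentityProvider:
    """Stand-in for a token-issuing service (no real OAuth 2.0 here)."""
    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}          # token -> client id

    def issue_token(self, client_id: str) -> str:
        token = secrets.token_hex(8)
        self._tokens[token] = client_id
        return token

    def client_of(self, token: str) -> str:
        return self._tokens[token]                 # KeyError if unknown


class Directory:
    """Stand-in for a directory that stores entries of Things."""
    def __init__(self, idp: IdentityProvider) -> None:
        self._idp = idp
        self._entries: Dict[str, dict] = {}

    def register(self, entry: dict) -> None:
        self._entries[entry["identifier"]] = entry

    def lookup(self, token: str, identifier: str) -> dict:
        requester = self._idp.client_of(token)
        entry = self._entries[identifier]
        if entry["accessRules"].get(requester) != "READ":
            raise PermissionError(f"{requester} may not read {identifier}")
        return entry


class Broker:
    """Stand-in for a message broker: routes a request to a bound handler."""
    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def bind(self, endpoint: str, handler: Callable[[dict], dict]) -> None:
        self._handlers[endpoint] = handler

    def request(self, endpoint: str, message: dict) -> dict:
        return self._handlers[endpoint](message)


def sign(payload: dict, key: bytes) -> str:
    """HMAC-SHA256 over the canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


SHARED_KEY = b"demo-key"                           # hypothetical pre-shared key


def factory_1_handler(message: dict) -> dict:
    """Factory 1 verifies the signature before answering."""
    expected = sign(message["payload"], SHARED_KEY)
    if not hmac.compare_digest(expected, message["signature"]):
        raise ValueError("signature mismatch")
    return {"status": {"cnc": "running"}}          # made-up status


idp = IdentityProvider()
directory = Directory(idp)
broker = Broker()
directory.register({"identifier": "s3i:4711",
                    "endpoints": ["s3ib://s3i:4711"],
                    "accessRules": {"s3i:4712": "READ"}})
broker.bind("s3ib://s3i:4711", factory_1_handler)

# Factory n (here: s3i:4712) obtains a token, looks up Factory 1, sends a request.
token = idp.issue_token("s3i:4712")
entry = directory.lookup(token, "s3i:4711")
payload = {"sender": "s3i:4712", "request": "getStatus"}
reply = broker.request(entry["endpoints"][0],
                       {"payload": payload, "signature": sign(payload, SHARED_KEY)})
print(reply)
```

The requesting side never touches the internals of Factory 1; it only needs a valid token, the directory entry and the agreed message format, which is the point the scenario is meant to demonstrate.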

To sum up, the participants in the network need not know how other Things are implemented, 
how their internal logic works, how communication proceeds, or which programming language 
is used. They only need the corresponding access rights to the interfaces to acquire 
appropriate information and invoke service functions because they 


Fig. 3 Various Industry 4.0 Things in distributed production systems interconnect with each other 
using the S3I and its provided services 

Fig. 4 Sequence diagram illustrating how Factory n retrieves the current production status of 
Factory 1 using the authentication service of the S3I Identity Provider, the directory service 
provided by the S3I Directory, and an AMQP message exchange provided by the S3I Broker 
    {
        "identifier": "s3i:4711",
        "name": "Factory 1",
        "description": "Directory entry for the I4.0 System Factory 1",
        "accessRules": [
            {
                "permissionPerObject": [
                    { "s3i:4712": "READ" }
                ]
            }
        ],
        "endpoints": [
            "s3ib://s3i:4711",
            "tcp://factory_1"
        ],
        "hasThings": [
            "s3i:4711-cnc",
            "s3i:4711-kuka-roboter",
            "s3i:4711-fanuc-roboter"
        ]
    }


Fig. 5 JSON-based meta information of Factory 1 that is based on the Semantic Data Model and 
centrally stored in the S3I Directory 


understand each other by means of the predefined semantics. More importantly, the S3I 
ensures the security of data communication and resources based on the CIA principles 
(confidentiality, integrity, availability), since OAuth 2.0 and a role-based authorization 
policy are used in the central services. 
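
As an illustration of how the access rules in the directory entry of Fig. 5 could be 
evaluated, the following minimal sketch parses the entry and checks whether a requester holds 
a given permission. The function is_allowed is hypothetical, and the nesting of accessRules 
and permissionPerObject is an assumption reconstructed from the figure; it is not taken from 
the S3I implementation. 

```python
import json

# Directory entry of Factory 1 as reconstructed from Fig. 5 (nesting is assumed).
ENTRY_JSON = """
{
    "identifier": "s3i:4711",
    "name": "Factory 1",
    "accessRules": [
        {"permissionPerObject": [{"s3i:4712": "READ"}]}
    ],
    "endpoints": ["s3ib://s3i:4711", "tcp://factory_1"],
    "hasThings": ["s3i:4711-cnc", "s3i:4711-kuka-roboter", "s3i:4711-fanuc-roboter"]
}
"""


def is_allowed(entry, requester_id, permission):
    """Return True if any access rule grants `permission` to `requester_id`."""
    for rule in entry.get("accessRules", []):
        for grant in rule.get("permissionPerObject", []):
            if grant.get(requester_id) == permission:
                return True
    return False


entry = json.loads(ENTRY_JSON)
print(is_allowed(entry, "s3i:4712", "READ"))   # requester listed in the entry
print(is_allowed(entry, "s3i:9999", "READ"))   # hypothetical unlisted requester
```

The check denies by default: a requester that appears in no rule gets no access, which matches 
the idea that participants only need the corresponding access rights to use an interface. 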


5 Conclusion 


Faced with the heterogeneous nature of production systems, their spatial distribution across 
different locations, and the trend toward large amounts of data, a centralized 
infrastructure is needed to connect everything decentrally. In this paper, we propose the 
concept of integrating an IoT communication infrastructure into production systems using the 
domain-independent S3I, which was originally developed for Forestry 4.0 Things. The provided 
simulation-based example demonstrates the application with a comprehensive method that 
ensures interoperability in a heterogeneous production network while taking the CIA security 
aspects into account. Moreover, the S3I does not require the resources of I4.0 Things to be 
hosted centrally in the services provided by the infrastructure; they can instead remain 
decentralized. From this perspective, the S3I can therefore be scaled to any size that the 
servers allow. The demonstrated application also illustrates that the S3I is generally 
applicable as an IoT solution, regardless of the domain. Consequently, the S3I can be regarded 
as a promising solution for an enlarged and secured IoT. Future work will focus on specific, 
classic security issues, such as DoS, injection, and man-in-the-middle attacks, and analyze 
the vulnerability and reliability of the S3I under these attacks. Additionally, lightweight 
data communication needs to be considered at the level of communication protocols and 
semantics as well. 